
HCV GENOMIC SEQUENCES FOR
DIAGNOSTICS AND THERAPEUTICS

This application is a continuation-in-part of U.S.
Serial No. 07/697,326 entitled "Polynucleotide Probes
Useful for Screening for Hepatitis C Virus," filed May
8, 1991.

Technical Field

The invention relates to compositions and methods
for the detection and treatment of hepatitis C virus
(HCV) infection, formerly referred to as blood-borne
non-A, non-B hepatitis virus (NANBV) infection. More
specifically, embodiments of the present invention
feature compositions and methods for the detection of
HCV, and for the development of vaccines for the
prophylactic treatment of infections of HCV, and
development of antibody products for conveying passive
immunity to HCV.

Background of the Invention

The prototype isolate of HCV was characterized in
U.S. Patent Application Serial No. 122,714 (see also
EPO Publication No. 318,216). As used herein, the term
"HCV" includes new isolates of the same viral species.
The term "HCV-1" refers to the prototype isolate
characterized in U.S. Patent Application Serial No.
122,714.

HCV is a transmissible disease distinguishable
from other forms of viral-associated liver diseases,
including that caused by the known hepatitis viruses,
i.e., hepatitis A virus (HAV), hepatitis B virus (HBV),
and delta hepatitis virus (HDV), as well as the
hepatitis induced by cytomegalovirus (CMV) or
Epstein-Barr virus (EBV). HCV was first identified in
individuals who had received blood transfusions.

The demand for sensitive, specific methods for
screening and identifying carriers of HCV and HCV
contaminated blood or blood products is significant.
Post-transfusion hepatitis (PTH) occurs in
approximately 10% of transfused patients, and HCV
accounts for up to 90% of these cases. The disease
frequently progresses to chronic liver damage (25-55%).

Patient care as well as the prevention of
transmission of HCV by blood and blood products or by
close personal contact require reliable screening,
diagnostic and prognostic tools to detect nucleic
acids, antigens and antibodies related to HCV.

Information in this application suggests that HCV
has several genotypes. That is, the genetic
information of HCV may not be totally
identical for all HCV, but encompasses groups with
differing genetic information.

Genetic information is stored in thread-like
molecules of DNA and RNA. DNA consists of covalently
linked chains of deoxyribonucleotides and RNA consists
of covalently linked chains of ribonucleotides. Each
nucleotide is characterized by one of four bases:
adenine (A), guanine (G), thymine (T), and cytosine
(C). The bases are complementary in the sense that,
due to the orientation of functional groups, certain
base pairs attract and bond to each other through
hydrogen bonding and π-stacking interactions.
Adenine in one strand of DNA pairs with thymine in an
opposing complementary strand. Guanine in one strand
of DNA pairs with cytosine in an opposing complementary
strand. In RNA, the thymine base is replaced by uracil
(U) which pairs with adenine in an opposing
complementary strand. The genetic code of living
organisms is carried in the sequence of base pairs.
Living cells interpret, transcribe and translate the
information of nucleic acid to make proteins and
peptides.
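
By way of illustration only, and without limitation, the base-pairing
rules set forth above may be expressed as a short Python sketch. The
names `DNA_PAIRS`, `RNA_PAIRS`, `complement` and `reverse_complement`
are assumptions chosen for this example.

```python
# Pairing rules as stated in the text: A pairs with T (or U in RNA),
# and G pairs with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(strand, rna=False):
    """Return the base-by-base complement of a strand."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(pairs[base] for base in strand.upper())


def reverse_complement(strand, rna=False):
    """Return the opposing strand, written in its own 5'-to-3' direction."""
    return complement(strand, rna)[::-1]
```

The reversed form is included because opposing strands run in opposite
directions, so a complementary strand is conventionally written in
reverse order.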

The HCV genome is comprised of a single positive
strand of RNA. The HCV genome possesses a continuous,
translational open reading frame (ORF) that encodes a
polyprotein of about 3,000 amino acids. In the ORF,
the structural protein(s) appear to be encoded in
approximately the first quarter of the N-terminus
region, with the majority of the polyprotein
responsible for non-structural proteins.

The HCV polyprotein comprises, from the amino
terminus to the carboxy terminus, the nucleocapsid
protein (C), the envelope protein (E), and the
non-structural proteins (NS) 1, 2 (b), 3, 4 (b), and 5.

HCV of differing genotypes may encode for proteins
which present an altered response to host immune
systems. HCV of differing genotypes may be difficult
to detect by immunodiagnostic techniques and nucleic
acid probe techniques which are not specifically
directed to such genotype.

Definitions for selected terms used in the
application are set forth below to facilitate an
understanding of the invention. The term
"corresponding" means homologous to or complementary to
a particular sequence of nucleic acid. As between
nucleic acids and peptides, corresponding refers to
amino acids of a peptide in an order derived from the
sequence of a nucleic acid or its complement.

The term "non-naturally occurring nucleic acid"
refers to a portion of genomic nucleic acid, cDNA,
semisynthetic nucleic acid, or synthetic origin nucleic
acid which, by virtue of its origin or manipulation:
(1) is not associated with all of a nucleic acid with
which it is associated in nature, (2) is linked to a
nucleic acid or other chemical agent other than that to
which it is linked in nature, or (3) does not occur in
nature.

Similarly, the term "non-naturally occurring
peptide" refers to a portion of a large naturally
occurring peptide or protein, or semi-synthetic or
synthetic peptide, which by virtue of its origin or
manipulation (1) is not associated with all of a
peptide with which it is associated in nature, (2) is
linked to peptides, functional groups or chemical
agents other than that to which it is linked in nature,
or (3) does not occur in nature.

The term "primer" refers to a nucleic acid which
is capable of initiating the synthesis of a larger
nucleic acid when placed under appropriate conditions.
The primer will be completely or substantially
complementary to a region of the nucleic acid to be
copied. Thus, under conditions conducive to
hybridization, the primer will anneal to a
complementary region of a larger nucleic acid. Upon
addition of suitable reactants, the primer is extended
by the polymerizing agent to form a copy of the larger
nucleic acid.
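
By way of illustration only, the annealing and extension of a primer
may be modeled on text strings as sketched below in Python. The
function names and the toy template are hypothetical, only exact
(completely complementary) matches are considered, and both strands are
written in the 5'-to-3' direction.

```python
# Translation table implementing A-T and G-C pairing for DNA strings.
_PAIRS = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposing strand written 5'-to-3'."""
    return seq.upper().translate(_PAIRS)[::-1]


def anneal_site(template, primer):
    """Index in the template where the primer anneals, or -1 if none."""
    return template.upper().find(reverse_complement(primer))


def extend(template, primer):
    """Return the extended primer, or None if the primer does not anneal.

    Extension proceeds from the 3' end of the primer toward the 5' end
    of the template, so the product is the reverse complement of the
    template from its 5' end through the annealing site.
    """
    site = anneal_site(template, primer)
    if site < 0:
        return None
    return reverse_complement(template[: site + len(primer)])
```

For the hypothetical template `GGATCCATTTAAGCGT` and primer `ACGCTTAA`,
the product begins with the primer itself and continues as a
complementary copy of the remainder of the template.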

The term "binding pair" refers to any pair of
molecules which exhibit mutual affinity or binding
capacity. For the purposes of the present application,
the term "ligand" will refer to one molecule of the
binding pair, and the term "antiligand" or "receptor"
or "target" will refer to the opposite molecule of the
binding pair. For example, with respect to nucleic
acids, a binding pair may comprise two complementary
nucleic acids. One of the nucleic acids may be
designated the ligand and the other strand is
designated the antiligand, receptor or target. The
designation of ligand or antiligand is a matter of
arbitrary convenience. Other binding pairs comprise,
by way of example, antigens and antibodies, drugs and
drug receptor sites and enzymes and enzyme substrates,
to name a few.

The term "label" refers to a molecular moiety 
capable of detection including, by way of example, 
without limitation, radioactive isotopes, enzymes, 
luminescent agents, precipitating agents, and dyes. 

The term "support" includes conventional supports 
such as filters and membranes as well as retrievable 
supports which can be substantially dispersed within a 
medium and removed or separated from the medium by 
immobilization, filtering, partitioning, or the like. 
The term "support means" refers to supports capable of 
being associated to nucleic acids, peptides or 
antibodies by binding partners, or covalent or 
noncovalent linkages. 

A number of HCV strains and isolates have been
identified. When compared with the sequence of the
original isolate derived from the USA ("HCV-1"; see
Q.-L. Choo et al. (1989) Science 244:359-362, Q.-L.
Choo et al. (1990) Brit. Med. Bull. 46:423-441, Q.-L.
Choo et al., Proc. Natl. Acad. Sci. 88:2451-2455
(1991), and E.P.O. Patent Publication No. 318,216,
cited supra), it was found that a Japanese isolate
("HCV J1") differed significantly in both nucleotide
and polypeptide sequence within the NS3 and NS4
regions. This conclusion was later extended to the NS5
and envelope (E1/S and E2/NS1) regions (see K. Takeuchi
et al., J. Gen. Virol. (1990) 71:3027-3033, Y. Kubo,
Nucl. Acids. Res. (1989) 17:10367-10372, and K.
Takeuchi et al., Gene (1990) 91:287-291). The former
group of isolates, originally identified in the United
States, is termed "Genotype I" throughout the present
disclosure, while the latter group of isolates,
initially identified in Japan, is termed "Genotype II"
herein.

Brief Description of the Invention 

The present invention features compositions of
matter comprising nucleic acids and peptides
corresponding to the HCV viral genome which define
different genotypes. The present invention also
features methods of using the compositions
corresponding to sequences of the HCV viral genome
which define different genotypes described herein.

A. Nucleic acid compositions

The nucleic acids of the present invention,
corresponding to the HCV viral genome which define
different genotypes, have utility as probes in nucleic
acid hybridization assays, as primers for reactions
involving the synthesis of nucleic acid, as binding
partners for separating HCV viral nucleic acid from
other constituents which may be present, and as
anti-sense nucleic acid for preventing the
transcription or translation of viral nucleic acid.

One embodiment of the present invention features a
composition comprising a non-naturally occurring
nucleic acid having a nucleic acid sequence of at least
eight nucleotides corresponding to a non-HCV-1
nucleotide sequence of the hepatitis C viral genome.
Preferably, the nucleotide sequence is selected from a
sequence present in at least one region consisting of
the NS5 region, envelope 1 region, 5'UT region, and the
core region.

Preferably, with respect to sequences which
correspond to the NS5 region, the sequence is selected
from a sequence within sequences numbered 2-22. The
sequence numbered 1 corresponds to HCV-1. Sequences
numbered 1-22 are defined in the Sequence Listing of
the application.

Preferably, with respect to sequences
corresponding to the envelope 1 region, the sequence is
selected from a sequence within sequences numbered
24-32. Sequence No. 23 corresponds to HCV-1.
Sequences numbered 23-32 are set forth in the Sequence
Listing of the application.

Preferably, with respect to the sequences which
correspond to the 5'UT region, the sequence is
selected from a sequence within sequences numbered
34-51. Sequence No. 33 corresponds to HCV-1. Sequences
No. 33-51 are set forth in the Sequence Listing of this
application.

Preferably, with respect to the sequences which
correspond to the core region, the sequence is selected
from a sequence within the sequences numbered 53-66.
Sequence No. 52 corresponds to HCV-1. Sequences 52-66
are set forth in the Sequence Listing of this
application.

The compositions of the present invention form 
hybridization products with nucleic acid corresponding 
to different genotypes of HCV. 

HCV has at least five genotypes, which will be
referred to in this application by the designations
GI-GV. The first genotype, GI, is exemplified by
sequences numbered 1-6, 23-25, 33-38 and 52-57. The
second genotype, GII, is exemplified by the sequences
numbered 7-12, 26-28, 39-45 and 58-64. The third
genotype, GIII, is exemplified by sequences numbered
13-17, 32, 46-47 and 65-66. The fourth genotype, GIV,
is exemplified by sequences numbered 20-22, 29-31
and 48-49. The fifth genotype, GV, is exemplified by
sequences numbered 18, 19, 50 and 51.
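
For convenience of reference, the assignment of sequence numbers to
regions and genotypes recited above may be tabulated as in the
following Python sketch. The numeric ranges are taken from the text;
the names `REGIONS`, `GENOTYPES` and `classify` are assumptions made
for this illustration.

```python
# Sequence numbers assigned to each region of the viral genome.
REGIONS = {
    "NS5": range(1, 23),
    "envelope 1": range(23, 33),
    "5'UT": range(33, 52),
    "core": range(52, 67),
}

# Inclusive (first, last) spans of sequence numbers per genotype.
GENOTYPES = {
    "GI": [(1, 6), (23, 25), (33, 38), (52, 57)],
    "GII": [(7, 12), (26, 28), (39, 45), (58, 64)],
    "GIII": [(13, 17), (32, 32), (46, 47), (65, 66)],
    "GIV": [(20, 22), (29, 31), (48, 49)],
    "GV": [(18, 19), (50, 51)],
}


def classify(seq_no):
    """Return (region, genotype) for a sequence number from 1 to 66."""
    region = next((name for name, r in REGIONS.items() if seq_no in r), None)
    genotype = next(
        (label for label, spans in GENOTYPES.items()
         if any(first <= seq_no <= last for first, last in spans)),
        None,
    )
    if region is None or genotype is None:
        raise ValueError(f"sequence number {seq_no} is outside 1-66")
    return region, genotype
```

As tabulated, every sequence number from 1 to 66 falls in exactly one
region and exactly one genotype.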

One embodiment of the present invention features
compositions comprising a nucleic acid having a
sequence corresponding to one or more sequences which
exemplify a genotype of HCV.

B. Method of forming a Hybridization Product

Embodiments of the present invention also feature
a method of forming a hybridization product with
nucleic acid having a sequence corresponding to HCV
nucleic acid. One method comprises the steps of
placing a non-naturally occurring nucleic acid having a
non-HCV-1 sequence corresponding to HCV nucleic acid
under conditions in which hybridization may occur. The
non-naturally occurring nucleic acid is capable of
forming a hybridization product with HCV nucleic acid,
under hybridization conditions. The method further
comprises the step of imposing hybridization conditions
to form a hybridization product in the presence of
nucleic acid corresponding to a region of the HCV
genome.

The formation of a hybridization product has
utility for detecting the presence of one or more
genotypes of HCV. Preferably, the non-naturally
occurring nucleic acid forms a hybridization product
with nucleic acid of HCV in one or more regions
comprising the NS5 region, envelope 1 region, 5'UT
region and the core region. To detect the
hybridization product, it is useful to associate the
non-naturally occurring nucleic acid with a label. The
formation of the hybridization product is detected by
separating the hybridization product from labeled
non-naturally occurring nucleic acid which has not
formed a hybridization product.

The formation of a hybridization product has
utility as a means of separating one or more genotypes
of HCV nucleic acid from other constituents potentially
present. For such applications, it is useful to
associate the non-naturally occurring nucleic acid with
a support for separating the resultant hybridization
product from the other constituents.

Nucleic acid "sandwich assays" employ one nucleic
acid associated with a label and a second nucleic acid
associated with a support. An embodiment of the
present invention features a sandwich assay comprising
two nucleic acids, both of which have sequences which
correspond to HCV nucleic acids; however, at least one
non-naturally occurring nucleic acid has a sequence
corresponding to non-HCV-1 HCV nucleic acid. At least
one nucleic acid is capable of associating with a
label, and the other is capable of associating with a
support. The support-associated non-naturally
occurring nucleic acid is used to separate the
hybridization products which include an HCV nucleic
acid and the non-naturally occurring nucleic acid
having a non-HCV-1 sequence.

One embodiment of the present invention features a
method of detecting one or more genotypes of HCV. The
method comprises the steps of placing a non-naturally
occurring nucleic acid under conditions in which
hybridization may occur. The non-naturally occurring
nucleic acid is capable of forming a hybridization
product with nucleic acid from one or more genotypes of
HCV. The first genotype, GI, is exemplified by
sequences numbered 1-6, 23-25, 33-38 and 52-57. The
second genotype, GII, is exemplified by the sequences
numbered 7-12, 26-28, 39-45 and 58-64. The third
genotype, GIII, is exemplified by sequences numbered
13-17, 32, 46-47 and 65-66. The fourth genotype, GIV,
is exemplified by sequences numbered 20-22 and 29-31. The
fifth genotype, GV, is exemplified by sequences
numbered 18, 19, 50 and 51.

The hybridization product of HCV nucleic acid with
a non-naturally occurring nucleic acid having a non-HCV-1
sequence corresponding to sequences within the HCV
genome has utility for priming a reaction for the
synthesis of nucleic acid.

The hybridization product of HCV nucleic acid with
a non-naturally occurring nucleic acid having a
sequence corresponding to a particular genotype of HCV
has utility for priming a reaction for the synthesis of
nucleic acid of such genotype. In one embodiment, the
synthesized nucleic acid is indicative of the presence
of one or more genotypes of HCV.

The synthesis of nucleic acid may also facilitate 
cloning of the nucleic acid into expression vectors 
which synthesize viral proteins. 

Embodiments of the present methods have utility as
anti-sense agents for preventing the transcription or
translation of viral nucleic acid. The formation of a
hybridization product between HCV nucleic acid and a
non-naturally occurring nucleic acid having sequences
which correspond to HCV genomic sequences of a
particular genotype may block translation or
transcription of such genotype. Therapeutic agents can
be engineered to include all five genotypes for
inclusivity.

C. Peptide and antibody composition 

A further embodiment of the present invention
features a composition of matter comprising a
non-naturally occurring peptide of three or more amino
acids corresponding to a nucleic acid having a
non-HCV-1 sequence. Preferably, the non-HCV-1 sequence
corresponds with a sequence within one or more regions
consisting of the NS5 region, the envelope 1 region,
the 5'UT region, and the core region.

Preferably, with respect to peptides corresponding
to a nucleic acid having a non-HCV-1 sequence of the
NS5 region, the sequence is within sequences numbered
2-22. The sequence numbered 1 corresponds to HCV-1.
Sequences numbered 1-22 are set forth in the Sequence
Listing.

Preferably, with respect to peptides corresponding
to a nucleic acid having a non-HCV-1 sequence of the
envelope 1 region, the sequence is within sequences
numbered 24-32. The sequence numbered 23 corresponds
to HCV-1. Sequences numbered 23-32 are set forth in
the Sequence Listing.

Preferably, with respect to peptides corresponding
to a nucleic acid having a non-HCV-1 sequence directed
to the core region, the sequence is within sequences
numbered 53-66. Sequence numbered 52 corresponds to
HCV-1. Sequences numbered 52-66 are set forth in the
Sequence Listing.

The further embodiment of the present invention
features peptide compositions corresponding to nucleic
acid sequences of a genotype of HCV. The first
genotype, GI, is exemplified by sequences numbered 1-6,
23-25, 33-38 and 52-57. The second genotype, GII, is
exemplified by the sequences numbered 7-12, 26-28,
39-45 and 58-64. The third genotype, GIII, is
exemplified by sequences numbered 13-17, 32, 46-47 and
65-66. The fourth genotype, GIV, is exemplified by
sequences numbered 20-22, 29-31, 48 and 49. The fifth
genotype, GV, is exemplified by sequences numbered 18,
19, 50 and 51.

The non-naturally occurring peptides of the 
present invention are useful as a component of a 
vaccine. The sequence information of the present 
invention permits the design of vaccines which are 
inclusive for all or some of the different genotypes of 
HCV. Directing a vaccine to a particular genotype 
allows prophylactic treatment to be tailored to 
maximize the protection to those agents likely to be 
encountered. Directing a vaccine to more than one 
genotype allows the vaccine to be more inclusive. 

The peptide compositions are also useful for the
development of specific antibodies to the HCV
proteins. One embodiment of the present invention
features, as a composition of matter, an antibody to
peptides corresponding to a non-HCV-1 sequence of the
HCV genome. Preferably, the non-HCV-1 sequence is
selected from the sequence within a region consisting
of the NS5 region, the envelope 1 region, and the core
region. There are no peptides associated with the
untranslated 5'UT region.

Preferably, with respect to antibodies directed to
peptides of the NS5 region, the peptide corresponds to
a sequence within sequences numbered 2-22. Preferably,
with respect to antibodies directed to a peptide
corresponding to the envelope 1 region, the peptide
corresponds to a sequence within sequences numbered
24-32. Preferably, with respect to the antibodies
directed to peptides corresponding to the core region,
the peptide corresponds to a sequence within sequences
numbered 53-66.

Antibodies directed to peptides which reflect a
particular genotype have utility for the detection of
such genotypes of HCV and as therapeutic agents.

One embodiment of the present invention features
an antibody directed to a peptide corresponding to
nucleic acid having sequences of a particular
genotype. The first genotype, GI, is exemplified by
sequences numbered 1-6, 23-25, 33-38 and 52-57. The
second genotype, GII, is exemplified by the sequences
numbered 7-12, 26-28, 39-45 and 58-64. The third
genotype, GIII, is exemplified by sequences numbered
13-17, 32, 46-47 and 65-66. The fourth genotype, GIV,
is exemplified by sequences numbered 20-22, 29-31, 48 and
49. The fifth genotype, GV, is exemplified by
sequences numbered 18, 19, 50 and 51.

Individuals skilled in the art will readily
recognize that the compositions of the present
invention can be packaged with instructions for use in
the form of a kit for performing nucleic acid
hybridizations or immunochemical reactions.

The present invention is further described in the
following figures which illustrate sequences
demonstrating genotypes of HCV. The sequences are
designated by numerals 1-145, which numerals and
sequences are consistent with the numerals and
sequences set forth in the Sequence Listing. Sequences
146 and 147 facilitate the discussion of an assay, which
numerals and sequences are consistent with the numerals
and sequences set forth in the Sequence Listing.

Brief Description of the Figures and Sequence Listing 

Figure 1 depicts schematically the genetic
organization of HCV;

Figure 2 sets forth nucleic acid sequences
numbered 1-22 which sequences are derived from the NS5
region of the HCV viral genome;

Figure 3 sets forth nucleic acid sequences
numbered 23-32 which sequences are derived from the
envelope 1 region of the HCV viral genome;

Figure 4 sets forth nucleic acid sequences
numbered 33-51 which sequences are derived from the
5'UT region of the HCV viral genome; and

Figure 5 sets forth nucleic acid sequences
numbered 52-66 which sequences are derived from the
core region of the HCV viral genome.

The Sequence Listing sets forth
sequences numbered 1-147.

Detailed Description of the Invention

The present invention will be described in detail
as nucleic acid having sequences corresponding to
the HCV genome and related peptides and binding
partners, for diagnostic and therapeutic applications.

The practice of the present invention will employ,
unless otherwise indicated, conventional techniques of
chemistry, molecular biology, microbiology, recombinant
DNA, and immunology, which are within the skill of the
art. Such techniques are explained fully in the
literature. See e.g., Maniatis, Fritsch & Sambrook,
Molecular Cloning: A Laboratory Manual (1982); DNA
Cloning, Volumes I and II (D.N. Glover ed. 1985);
Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic
Acid Hybridization (B.D. Hames & S.J. Higgins eds.
1984); the series, Methods in Enzymology (Academic
Press, Inc.), particularly Vol. 154 and Vol. 155 (Wu
and Grossman, eds.).

The cDNA libraries are derived from nucleic acid
sequences present in the plasma of an HCV-infected
chimpanzee. The construction of one of these
libraries, the "c" library (ATCC No. 40394), is
described in PCT Pub. No. WO90/14436. The sequences of
the library relevant to the present invention are set
forth herein as sequence numbers 1, 23, 33 and 52.

Nucleic acids isolated or synthesized in
accordance with features of the present invention are
useful, by way of example without limitation, as probes,
primers, anti-sense genes and for developing expression
systems for the synthesis of peptides corresponding to
such sequences.

The nucleic acid sequences described define
genotypes of HCV with respect to four regions of the
viral genome. Figure 1 depicts schematically the
organization of HCV. The four regions of particular
interest are the NS5 region, the envelope 1 region, the
5'UT region and the core region.

The sequences set forth in the present application
as sequences numbered 1-22 suggest at least five
genotypes in the NS5 region. Sequences numbered 1-22
are depicted in Figure 2 as well as the Sequence
Listing. Each sequence numbered 1-22 is derived from
nucleic acid having 340 nucleotides from the NS5 region.

The five genotypes are defined by groupings of the
sequences numbered 1-22. For
convenience, in the present application, the different
genotypes will be assigned Roman numerals and the
letter "G".

The first genotype (GI) is exemplified by
sequences within sequences numbered 1-6. A second
genotype (GII) is exemplified by sequences within
sequences numbered 7-12. A third genotype (GIII) is
exemplified by the sequences within sequences numbered
13-17. A fourth genotype (GIV) is exemplified by
sequences within sequences numbered 20-22. A fifth
genotype (GV) is exemplified by sequences within
sequences numbered 18 and 19.

The sequences set forth in the present application
as sequences numbered 23-32 suggest at least four
genotypes in the envelope 1 region of HCV. Sequences
numbered 23-32 are depicted in Figure 3 as well as in
the Sequence Listing. Each sequence numbered 23-32 is
derived from nucleic acid having 100 nucleotides from
the envelope 1 region.

A first envelope 1 genotype (GI) is
exemplified by the sequences within the sequences
numbered 23-25. A second envelope 1 genotype (GII)
is exemplified by sequences within sequences
numbered 26-28. A third envelope 1 genotype (GIII) is
exemplified by the sequence numbered
32. A fourth envelope 1 genotype (GIV) is exemplified
by the sequences within sequences numbered 29-31.

The sequences set forth in the present application
as sequences numbered 33-51 suggest at least three
genotypes in the 5'UT region of HCV. Sequences
numbered 33-51 are depicted in Figure 4 as well as in
the Sequence Listing. Each sequence numbered 33-51 is
derived from the nucleic acid having 252 nucleotides
from the 5'UT region, although sequences 50 and 51 are
somewhat shorter at approximately 180 nucleotides.

The first 5'UT genotype (GI) is exemplified by the
sequences within sequences numbered 33-38. A second
5'UT genotype (GII) is exemplified by the sequences
within sequences numbered 39-45. A third 5'UT genotype
(GIII) is exemplified by the sequences within sequences
numbered 46-47. A fourth 5'UT genotype (GIV) is
exemplified by sequences within sequences numbered 48
and 49. A fifth 5'UT genotype (GV) is exemplified by
sequences within sequences numbered 50 and 51.

The sequences numbered 52-66 suggest at least
three genotypes in the core region of HCV. The
sequences numbered 52-66 are depicted in Figure 5 as
well as in the Sequence Listing.

The first core region genotype (GI) is exemplified
by the sequences within sequences numbered 52-57. The
second core region genotype (GII) is exemplified by
sequences within sequences numbered 58-64. The third
core region genotype (GIII) is exemplified by sequences
within sequences numbered 65 and 66. Sequences
numbered 52-65 are comprised of 549 nucleotides.
Sequence numbered 66 is comprised of 510 nucleotides.

The various genotypes described with respect to
each region are consistent. That is, HCV having
features of the first genotype with respect to the NS5
region will substantially conform to features of the
first genotype of the envelope 1 region, the 5'UT
region and the core region.

Nucleic acids isolated or synthesized in accordance
with the sequences set forth in sequence numbers 1-66
are useful as probes, primers, capture ligands and
anti-sense agents. As probes, primers, capture ligands
and anti-sense agents, the nucleic acid will normally
comprise approximately eight or more nucleotides for
specificity as well as the ability to form stable
hybridization products.

Probes

A nucleic acid isolated or synthesized in
accordance with a sequence defining a particular
genotype of a region of the HCV genome can be used as a
probe to detect such genotype or used in combination
with other nucleic acid probes to detect substantially
all genotypes of HCV.

With the sequence information set forth in the
present application, sequences of eight or more
nucleotides are identified which provide the desired
inclusivity and exclusivity with respect to various
genotypes within HCV, and extraneous nucleic acid
sequences likely to be encountered during hybridization
conditions.
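
By way of example only, one simple way to look for candidate probe
sequences with the desired inclusivity and exclusivity is sketched
below in Python. The sketch assumes exact matching of subsequences of
eight nucleotides; the function names and the toy input strings are
hypothetical and are not taken from the Sequence Listing.

```python
def kmers(seq, k=8):
    """Return the set of all subsequences of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_probes(targets, non_targets, k=8):
    """Subsequences present in every target and absent from every non-target.

    Presence in all targets models inclusivity; absence from all
    non-targets models exclusivity.
    """
    if not targets:
        return []
    shared = set.intersection(*(kmers(s, k) for s in targets))
    excluded = set().union(*(kmers(s, k) for s in non_targets))
    return sorted(shared - excluded)
```

In practice hybridization tolerates some mismatches, so exact matching
of this kind is only a first screen and the stringency of the
hybridization conditions would also need to be considered.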

Individuals skilled in the art will readily
recognize that the nucleic acid sequences, for use as
probes, can be provided with a label to facilitate
detection of a hybridization product.

Capture Ligand

For use as a capture ligand, the nucleic acid
selected in the manner described above with respect to
probes can be readily associated with supports. The
manner in which nucleic acid is associated with
supports is well known. Nucleic acids having sequences
corresponding to a sequence within sequences numbered
1-66 have utility to separate viral nucleic acid of one
genotype from the nucleic acid of HCV of a different
genotype. Nucleic acids isolated or synthesized in
accordance with sequences within sequences numbered
1-66, used in combinations, have utility to capture
substantially all nucleic acid of all HCV genotypes.

Primers

Nucleic acids isolated or synthesized in accordance
with the sequences described herein have utility as
primers for the amplification of HCV sequences. With
respect to polymerase chain reaction (PCR) techniques,
nucleic acid sequences of eight or more nucleotides
corresponding to one or more sequences of sequences
numbered 1-66 have utility in conjunction with suitable
enzymes and reagents to create copies of the viral
nucleic acid. A plurality of primers having different
sequences corresponding to more than one genotype can
be used to create copies of viral nucleic acid for such
genotypes.

The copies can be used in diagnostic assays to
detect HCV virus. The copies can also be incorporated
into cloning and expression vectors to generate
polypeptides corresponding to the nucleic acid
synthesized by PCR, as will be described in greater
detail below.
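
By way of illustration only, the region copied by a pair of primers,
and the use of a plurality of primer pairs directed to different
genotypes, may be modeled as in the following Python sketch. Exact
matching is assumed, all sequences are written 5'-to-3', and the
function names, panel labels and toy sequences are hypothetical.

```python
# Translation table implementing A-T and G-C pairing for DNA strings.
_PAIRS = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposing strand written 5'-to-3'."""
    return seq.upper().translate(_PAIRS)[::-1]


def amplicon(template, forward, reverse):
    """Return the template region bounded by the two primers, or None.

    The forward primer matches the template directly; the reverse
    primer anneals where the template contains its reverse complement.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(site)]


def amplify_with_panel(template, panel):
    """Apply each (forward, reverse) pair in the panel to the template.

    The panel maps a label, such as a genotype designation, to a primer
    pair; only pairs that yield a product appear in the result.
    """
    hits = {}
    for label, (forward, reverse) in panel.items():
        product = amplicon(template, forward, reverse)
        if product is not None:
            hits[label] = product
    return hits
```

A panel of this kind reports which primer pairs produce a copy from a
given sample, which is one way the synthesized nucleic acid can be
indicative of the genotype present.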

Anti-sense 

Nucleic acid isolated or synthesized in accordance
with the sequences described herein has utility as
anti-sense genes to prevent the expression of HCV.
Nucleic acid corresponding to a genotype of HCV is
loaded into a suitable carrier such as a liposome for
introduction into a cell infected with HCV. A nucleic
acid having eight or more nucleotides is capable of
binding to viral nucleic acid or viral messenger RNA.
Preferably, the anti-sense nucleic acid is comprised of
30 or more nucleotides to provide necessary stability
of a hybridization product of viral nucleic acid or
viral messenger RNA. Methods for loading anti-sense
nucleic acid are known in the art as exemplified by U.S.
Patent 4,241,046 issued December 23, 1980 to
Papahadjopoulos et al.

Peptide Synthesis

Nucleic acid isolated or synthesized in accordance
with the sequences described herein has utility to
generate peptides. The sequences exemplified by
sequences numbered 1-32 and 52-66 can be cloned into
suitable vectors or used to isolate nucleic acid. The
isolated nucleic acid is combined with suitable DNA
linkers and cloned into a suitable vector. The vector
can be used to transform a suitable host organism such
as E. coli and the peptide encoded by the sequences
isolated.

Molecular cloning techniques are described in the
text Molecular Cloning: A Laboratory Manual, Maniatis
et al., Cold Spring Harbor Laboratory (1982).

The isolated peptide has utility as an antigenic 
substance for the development of vaccines and 
antibodies directed to the particular genotype of HCV. 

Vaccines and Antibodies 

The peptide materials of the present invention 
have utility for the development of antibodies and 
vaccines. 

The availability of cDNA sequences, or nucleotide
sequences derived therefrom (including segments and
modifications of the sequence), permits the
construction of expression vectors encoding
antigenically active regions of the peptide encoded in
either strand. The antigenically active regions may be
derived from the NS5 region, envelope 1 regions, and
the core region.

Fragments encoding the desired peptides are
derived from the cDNA clones using conventional
restriction digestion or by synthetic methods, and are
ligated into vectors which may, for example, contain
portions of fusion sequences such as beta galactosidase
or superoxide dismutase (SOD), preferably SOD. Methods
and vectors which are useful for the production of
polypeptides which contain fusion sequences of SOD are
described in European Patent Office Publication number
0196056, published October 1, 1986.

Any desired portion of the HCV cDNA containing an
open reading frame, in either sense strand, can be
obtained as a recombinant peptide, such as a mature or
fusion protein; alternatively, a peptide encoded in the
cDNA can be provided by chemical synthesis.

The DNA encoding the desired peptide, whether in
fused or mature form, and whether or not containing a
signal sequence to permit secretion, may be ligated
into expression vectors suitable for any convenient
host. Both eukaryotic and prokaryotic host systems are
presently used in forming recombinant peptides. The
peptide is then isolated from lysed cells or from the
culture medium and purified to the extent needed for
its intended use. Purification may be by techniques
known in the art, for example, differential extraction,
salt fractionation, chromatography on ion exchange
resins, affinity chromatography, centrifugation, and
the like. See, for example, Methods in Enzymology for
a variety of methods for purifying proteins. Such
peptides can be used as diagnostics, or those which
give rise to neutralizing antibodies may be formulated
into vaccines. Antibodies raised against these
peptides can also be used as diagnostics, or for
passive immunotherapy or for isolating and identifying
HCV.

An antigenic region of a peptide is generally
relatively small, typically 8 to 10 amino acids or less
in length. Fragments of as few as 5 amino acids may
characterize an antigenic region. These segments may
correspond to the NS5 region, envelope 1 region, and the
core region of the HCV genome. The 5'UT region is not
known to be translated. Accordingly, using the cDNAs
of such regions, DNAs encoding short segments of HCV
peptides corresponding to such regions can be expressed
recombinantly either as fusion proteins, or as isolated
peptides. In addition, short amino acid sequences can
be conveniently obtained by chemical synthesis. In
instances wherein the synthesized peptide is correctly
configured so as to provide the correct epitope, but is
too small to be immunogenic, the peptide may be linked
to a suitable carrier.

A number of techniques for obtaining such linkage
are known in the art, including the formation of
disulfide linkages using N-succinimidyl-3-(2-
pyridylthio)propionate (SPDP) and succinimidyl
4-(N-maleimido-methyl)cyclohexane-1-carboxylate (SMCC)
obtained from Pierce Company, Rockford, Illinois (if
the peptide lacks a sulfhydryl group, this can be
provided by addition of a cysteine residue). These
reagents create a disulfide linkage between themselves
and peptide cysteine residues on one protein and an
amide linkage through the epsilon-amino on a lysine, or
other free amino group in the other. A variety of such
disulfide/amide-forming agents are known. See, for
example, Immun Rev (1982) 62:185. Other bifunctional
coupling agents form a thioether rather than a
disulfide linkage. Many of these thio-ether-forming
agents are commercially available and include reactive
esters of 6-maleimidocaproic acid, 2-bromoacetic acid,
2-iodoacetic acid, 4-(N-maleimido-methyl)cyclohexane-1-
carboxylic acid, and the like. The carboxyl groups can
be activated by combining them with succinimide or
1-hydroxyl-2-nitro-4-sulfonic acid, sodium salt.
Additional methods of coupling antigens employ the
rotavirus/"binding peptide" system described in EPO
Pub. No. 259,149, the disclosure of which is
incorporated herein by reference. The foregoing list
is not meant to be exhaustive, and modifications of the
named compounds can clearly be used.

Any carrier may be used which does not itself
induce the production of antibodies harmful to the
host. Suitable carriers are typically large, slowly
metabolized macromolecules such as proteins;
polysaccharides, such as latex functionalized
Sepharose, agarose, cellulose, cellulose beads and the
like; polymeric amino acids, such as polyglutamic acid,
polylysine, and the like; amino acid copolymers; and
inactive virus particles. Especially useful protein
substrates are serum albumins, keyhole limpet
hemocyanin, immunoglobulin molecules, thyroglobulin,
ovalbumin, tetanus toxoid, and other proteins well
known to those skilled in the art.

Peptides comprising HCV amino acid sequences
encoding at least one viral epitope derived from the
NS5, envelope 1, and core region are useful
immunological reagents. The 5'UT region is not known
to be translated. For example, peptides comprising
such truncated sequences can be used as reagents in an
immunoassay. These peptides also are candidate subunit
antigens in compositions for antiserum production or
vaccines. While the truncated sequences can be
produced by various known treatments of native viral
protein, it is generally preferred to make synthetic or
recombinant peptides comprising HCV sequence. Peptides
comprising these truncated HCV sequences can be made up
entirely of HCV sequences (one or more epitopes, either
contiguous or noncontiguous), or HCV sequences and
heterologous sequences in a fusion protein. Useful
heterologous sequences include sequences that provide
for secretion from a recombinant host, enhance the
immunological reactivity of the HCV epitope(s), or
facilitate the coupling of the polypeptide to an
immunoassay support or a vaccine carrier. See, e.g.,
EPO Pub. No. 116,201; U.S. Pat. No. 4,722,840; EPO Pub.
No. 259,149; U.S. Pat. No. 4,629,783.

The size of peptides comprising the truncated HCV
sequences can vary widely, the minimum size being a
sequence of sufficient size to provide an HCV epitope,
while the maximum size is not critical. For
convenience, the maximum size usually is not
substantially greater than that required to provide the
desired HCV epitopes and function(s) of the
heterologous sequence, if any. Typically, the
truncated HCV amino acid sequence will range from about
5 to about 100 amino acids in length. More typically,
however, the HCV sequence will be a maximum of about 50
amino acids in length, preferably a maximum of about 30
amino acids. It is usually desirable to select HCV
sequences of at least about 10, 12 or 15 amino acids,
up to a maximum of about 20 or 25 amino acids.

HCV amino acid sequences comprising epitopes can
be identified in a number of ways. For example, the
entire protein sequence corresponding to each of the
NS5, envelope 1, and core regions can be screened by
preparing a series of short peptides that together span
the entire protein sequence of such regions. By
starting with, for example, peptides of approximately
100 amino acids, it would be routine to test each
peptide for the presence of epitope(s) showing a
desired reactivity, and then to test progressively
smaller and overlapping fragments from an identified
peptide of 100 amino acids to map the epitope of
interest. Screening such peptides in an immunoassay is
within the skill of the art. It is also known to carry
out a computer analysis of a protein sequence to
identify potential epitopes, and then prepare peptides
comprising the identified regions for screening.
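
The screening approach described above, a series of
overlapping peptides that together span a region, can
be sketched in a few lines. This is a minimal
illustration only: the function name, the window and
step sizes, and the 20-residue toy sequence are
assumptions made for the example and are not taken
from the sequences disclosed herein.

```python
def overlapping_peptides(protein, length, step):
    """Return (start, peptide) pairs that together span the whole protein.

    Windows of `length` residues are taken every `step` residues. A final
    window is anchored at the C-terminus so that no residue is left out.
    Start positions are 1-based.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(protein) <= length:
        return [(1, protein)]
    starts = list(range(0, len(protein) - length + 1, step))
    last = len(protein) - length
    if starts[-1] != last:
        starts.append(last)
    return [(s + 1, protein[s:s + length]) for s in starts]


if __name__ == "__main__":
    # Hypothetical sequence, used only to show the tiling.
    toy = "MSTNPKPQRKTKRNTNRRPQ"
    for start, peptide in overlapping_peptides(toy, length=8, step=4):
        print(start, peptide)
```

A smaller step gives more overlap between neighboring
peptides, at the cost of more peptides to screen.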

The immunogenicity of the epitopes of HCV may also
be enhanced by preparing them in mammalian or yeast
systems fused with or assembled with particle-forming
proteins such as, for example, that associated with
hepatitis B surface antigen. See, e.g., US 4,722,840.
Constructs wherein the HCV epitope is linked directly
to the particle-forming protein coding sequences
produce hybrids which are immunogenic with respect to
the HCV epitope. In addition, all of the vectors
prepared include epitopes specific to HBV, having
various degrees of immunogenicity, such as, for
example, the pre-S peptide. Thus, particles
constructed from particle forming protein which include
HCV sequences are immunogenic with respect to HCV and
HBV.

Hepatitis surface antigen (HBSAg) has been shown
to be formed and assembled into particles in S.
cerevisiae (P. Valenzuela et al. (1982)), as well as
in, for example, mammalian cells (P. Valenzuela et al.
(1984)). The formation of such particles has been shown
to enhance the immunogenicity of the monomer subunit.
The constructs may also include the immunodominant
epitope of HBSAg, comprising the 55 amino acids of the
presurface (pre-S) region. Neurath et al. (1984).
Constructs of the pre-S-HBSAg particle expressible in
yeast are disclosed in EPO 174,444, published March 19,
1986; hybrids including heterologous viral sequences
for yeast expression are disclosed in EPO 175,261,
published March 26, 1986. These constructs may also be
expressed in mammalian cells such as Chinese hamster
ovary (CHO) cells using an SV40-dihydrofolate reductase
vector (Michelle et al. (1984)).

In addition, portions of the particle-forming
protein coding sequence may be replaced with codons
encoding an HCV epitope. In this replacement, regions
which are not required to mediate the aggregation of
the units to form immunogenic particles in yeast or
mammals can be deleted, thus eliminating additional HBV
antigenic sites from competition with the HCV epitope.

Vaccines 

Vaccines may be prepared from one or more
immunogenic peptides derived from HCV. The observed
homology between HCV and Flaviviruses provides
information concerning the peptides which are likely to
be most effective as vaccines, as well as the regions
of the genome in which they are encoded.

Multivalent vaccines against HCV may be comprised
of one or more epitopes from one or more proteins
derived from the NS5, envelope 1, and core regions. In
particular, vaccines are contemplated comprising one or
more HCV proteins or subunit antigens derived from the
NS5, envelope 1, and core regions. The 5'UT region is
not known to be translated.

The preparation of vaccines which contain an
immunogenic peptide as an active ingredient is known
to one skilled in the art. Typically, such vaccines
are prepared as injectables, either as liquid solutions
or suspensions; solid forms suitable for solution in,
or suspension in, liquid prior to injection may also be
prepared. The preparation may also be emulsified, or
the protein encapsulated in liposomes. The active
immunogenic ingredients are often mixed with excipients
which are pharmaceutically acceptable and compatible
with the active ingredient. Suitable excipients are,
for example, water, saline, dextrose, glycerol,
ethanol, or the like and combinations thereof. In
addition, if desired, the vaccine may contain minor
amounts of auxiliary substances such as wetting or
emulsifying agents, pH buffering agents, and/or
adjuvants which enhance the effectiveness of the
vaccine. Examples of adjuvants which may be effective
include but are not limited to: aluminum hydroxide,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP
11637, referred to as nor-MDP), N-acetylmuramyl-L-
alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-
sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP
19835A, referred to as MTP-PE), and RIBI, which
contains three components extracted from bacteria,
monophosphoryl lipid A, trehalose dimycolate and cell
wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80
emulsion. The effectiveness of an adjuvant may be
determined by measuring the amount of antibodies
directed against an immunogenic peptide containing an
HCV antigenic sequence resulting from administration of
this peptide in vaccines which are also comprised of
the various adjuvants.

The vaccines are conventionally administered
parenterally, by injection, for example, either
subcutaneously or intramuscularly. Additional
formulations which are suitable for other modes of
administration include suppositories and, in some
cases, oral formulations. For suppositories,
traditional binders and carriers may include, for
example, polyalkylene glycols or triglycerides; such


WO 92/19743    PCT/US92/04036
suppositories may be formed from mixtures containing
the active ingredient in the range of 0.5% to 10%,
preferably 1%-2%. Oral formulations include such
normally employed excipients as, for example,
pharmaceutical grades of mannitol, lactose, starch,
magnesium stearate, sodium saccharine, cellulose,
magnesium carbonate, and the like.

The examples below are provided for illustrative 
purposes and are not intended to limit the scope of the 
present invention. 

I. Detection of HCV RNA from Serum

RNA was extracted from serum using guanidinium
salt, phenol and chloroform according to the
instructions of the kit manufacturer (RNAzol B kit,
Cinna/Biotecx). Extracted RNA was precipitated with
isopropanol and washed with ethanol. A total of 25
µl serum was processed for RNA isolation, and the
purified RNA was resuspended in 5 µl diethyl
pyrocarbonate treated water for subsequent cDNA
synthesis.

II. cDNA Synthesis and Polymerase Chain Reaction (PCR)
Amplification

Table 1 lists the sequence and position (with
reference to HCV1) of all the PCR primers and probes
used in these examples. Letter designations for
nucleotides are consistent with 37 C.F.R.
§§1.821-1.825. Thus, the letters A, C, G, T, and U are
used in the ordinary sense of adenine, cytosine,
guanine, thymine, and uracil. The letter M means A or
C; R means A or G; W means A or T/U; S means C or G; Y
means C or T/U; K means G or T/U; V means A or C or G,
not T/U; H means A or C or T/U, not G; D means A or G
or T/U, not C; B means C or G or T/U, not A; N means (A
or C or G or T/U) or (unknown or other). Table 1 is set
forth below:

Table 1

Seq. No.  Sequence (5'-3')                  Nucleotide position

67        CAAACGTAACACCAACCGRCGCCCACAGG     [illegible]
68        ACAGAYCCGCAKAGRTCCCCCACG          [illegible]
69        GCAACCTCGAGGTAGACGTCAGCCTATCCC    509-538
70        GCAACCTCGTGGAAGGCGACAACCTATCCC    509-538
71        GTCACCAATGATTGCCCTAACTCGAGTATT    [illegible]
72        GTCACGAACGACTGCTCCAACTCAAG        [illegible]
73        TGGACATGATCGCTGGWGCYCACTGGGG      1375-1402
74        TGGAYATGGTGGYGGGGGCYCACTGGGG      1375-1402
75        ATGATGAACTGGTCVCCXAC              1308-1327
76        ACCTTVGCCCAGTTSCCC[?]CCATGGA      1453-1428
77        AACCCACTCTATGYCCGGYCAT            [illegible]
78        GAATCGCTGGGGTGACCG                171-188
79        CCATGAATCACTCCCCTGTGA[??]GAACTA   [illegible]
80        TTGCGGGGGCACGCCCAA                [illegible]
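
The letter designations above can be applied
mechanically to count or enumerate the individual
sequences represented by a degenerate primer. The
following is a minimal sketch under stated assumptions:
the helper names are hypothetical, T and U are treated
as equivalent, and N is expanded to the four bases only
(its "unknown or other" meaning is ignored). Sequence
No. 67 is used as the example input.

```python
from itertools import product

# Ambiguity codes as defined in the text; T stands for T/U.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
    "N": "ACGT",  # assumption: "unknown or other" is not modeled
}


def degeneracy(seq):
    """Number of unambiguous sequences encoded by a degenerate sequence."""
    count = 1
    for base in seq.upper():
        count *= len(IUPAC[base])
    return count


def expand(seq):
    """List every unambiguous sequence encoded by a degenerate sequence."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    primer_67 = "CAAACGTAACACCAACCGRCGCCCACAGG"  # Sequence No. 67
    print(degeneracy(primer_67))
    for variant in expand(primer_67):
        print(variant)
```

Because the number of variants grows multiplicatively
with each ambiguous position, `degeneracy` is the safer
call for highly degenerate sequences.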
For cDNA synthesis and PCR amplification, a
protocol developed by Perkin-Elmer/Cetus ([illegible]
PCR kit) was used. Both [illegible] specific
[illegible] adding and mixing reaction components,
performed in a thermal cycler (MJ Research, Inc.).
The first strand cDNA synthesis reaction was
inactivated at 99°C for 5 min and then cooled at 50°C
for 5 min before adding reaction components for
subsequent amplification. After an initial 5 cycles of
97°C for 1 min, 50°C for [illegible] min, and 72°C for
3 min, 30 cycles of 94°C for 1 min, 55°C for
[illegible] min, and 72°C for 3 min followed, and then
a final 7 min of elongation at 72°C.

For the genotyping analysis, sequences 67 and 68
were used as primers in the PCR reaction. These
primers amplify a segment corresponding to the core and
envelope regions. After amplification, the reaction
products were separated on an agarose gel and then
transferred to a nylon membrane. The immobilized
reaction products were allowed to hybridize with a
32P-labelled nucleic acid corresponding to either
Genotype I (core or envelope 1) or Genotype II (core or
envelope 1). Nucleic acid corresponding to Genotype I
comprised sequences numbered 69 (core), 71 (envelope),
and 73 (envelope). Nucleic acid corresponding to
Genotype II comprised sequences numbered 70 (core), 72
(envelope), and 74 (envelope).

The Genotype I probes only hybridized to the
product amplified from isolates which had Genotype I
sequence. Similarly, Genotype II probes only
hybridized to the product amplified from isolates which
had Genotype II sequence.

In another experiment, PCR products were generated
using sequences 79 and 80. The products were analyzed
as described above except Sequence No. 73 was used to
detect Genotype I, Sequence No. 74 was used to detect
Genotype II, Sequence No. 77 (5'UT) was used to detect
Genotype III, and Sequence No. 78 (5'UT) was used to
detect Genotype IV. Each sequence hybridized in a
genotype specific manner.
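
The genotype-specific readout of the second experiment
amounts to a lookup from the probe that hybridized to
the genotype it detects. A minimal sketch follows; the
function name and the handling of absent or conflicting
signals are assumptions added for illustration.

```python
# Probe-to-genotype assignments as stated for the second experiment.
PROBE_GENOTYPE = {73: "I", 74: "II", 77: "III", 78: "IV"}


def call_genotype(hybridized):
    """Return the genotype implied by the probes that gave a signal.

    `hybridized` is an iterable of sequence numbers. Returns None when no
    listed probe hybridized. Raises ValueError when probes for more than
    one genotype hybridized (an assumed policy, since each probe is
    reported to react only with its own genotype).
    """
    genotypes = {PROBE_GENOTYPE[n] for n in hybridized if n in PROBE_GENOTYPE}
    if not genotypes:
        return None
    if len(genotypes) > 1:
        raise ValueError("ambiguous result: " + ", ".join(sorted(genotypes)))
    return genotypes.pop()


if __name__ == "__main__":
    print(call_genotype([74]))
```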

III. Detection of HCV GI-GIV using a sandwich
hybridization assay for HCV RNA

An amplified solution phase nucleic acid sandwich
hybridization assay format is described in this
example. The assay format employs several nucleic acid
probes to effect capture and detection. A capture
probe nucleic acid is capable of associating a
complementary probe bound to a solid support and HCV
nucleic acid to effect capture. A detection probe
nucleic acid has a first segment (A) that binds to HCV
nucleic acid and a second segment (B) that hybridizes
to a second amplifier nucleic acid.
The amplifier nucleic acid has a first segment (B*)
that hybridizes to segment (B) of the probe nucleic
acid and also comprises fifteen iterations of a segment
(C). Segment (C) of the amplifier nucleic acid is
capable of hybridizing to three labeled nucleic acids.
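
The amplification implied by this format can be worked
through as simple arithmetic: each amplifier carries
fifteen copies of segment (C), and each copy can bind
three labeled nucleic acids. The sketch below assumes,
for illustration only, that every site is occupied and
that each detection probe binds one amplifier; the
function and parameter names are hypothetical.

```python
def max_labels_per_target(detection_probes, c_repeats=15, labels_per_c=3):
    """Upper bound on labeled nucleic acids bound to one captured target.

    Assumes full occupancy: one amplifier per detection probe, and
    `labels_per_c` labels on each of the `c_repeats` copies of segment (C).
    """
    if detection_probes < 0:
        raise ValueError("detection_probes must be non-negative")
    return detection_probes * c_repeats * labels_per_c


if __name__ == "__main__":
    # One detection probe: 15 segments x 3 labels each.
    print(max_labels_per_target(1))
```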

Nucleic acid sequences which correspond to
nucleotide sequences of the envelope 1 gene of Group I
HCV isolates are set forth in sequences numbered
81-99. Table 2 sets forth the area of the HCV genome
to which the nucleic acid sequences correspond and a
preferred use of the sequences.
Table 2

Probe Type    Sequence No.    Complement of
                              Nucleotide Numbers

Label         81              879-911
Label         82              912-944
Capture       83              945-977
Label         84              978-1010
Label         85              1011-1043
Label         86              1044-1076
Label         87              1077-1109
Capture       88              1110-1142
Label         89              1143-1175
Label         90              1176-1208
Label         91              1209-1241
Label         92              1242-1274
Capture       93              1275-1307
Label         94              1308-1340
Label         95              1341-1373
Label         96              1374-1406
Label         97              1407-1439
Capture       98              1440-1472
Label         99              1473-1505
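
The nucleotide ranges in Table 2 are contiguous
33-nucleotide windows running from position 879 to
position 1505. As a minimal sketch, such a tiling and
the complement of a target segment could be computed as
follows. The function names are hypothetical, and the
capture positions are simply copied from the table.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Sequence numbers marked "Capture" in Table 2; all others are "Label".
CAPTURE_SEQ_NOS = {83, 88, 93, 98}


def tile_regions(first, last, width):
    """Split the inclusive range first..last into contiguous windows."""
    return [(start, min(start + width - 1, last))
            for start in range(first, last + 1, width)]


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def table_2_rows():
    """Rebuild the (probe type, sequence no., start, end) rows of Table 2."""
    rows = []
    for offset, (start, end) in enumerate(tile_regions(879, 1505, 33)):
        seq_no = 81 + offset
        kind = "Capture" if seq_no in CAPTURE_SEQ_NOS else "Label"
        rows.append((kind, seq_no, start, end))
    return rows


if __name__ == "__main__":
    for row in table_2_rows():
        print(*row)
```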



Nucleic acid sequences which correspond to
nucleotide sequences of the envelope 1 gene of Group II
HCV isolates are set forth in sequences 100-118. Table
3 sets forth the area of the HCV genome to which the
nucleic acid corresponds and the preferred use of the
sequences.

Table 3

Probe Type    Sequence No.    Complement of
                              Nucleotide Numbers

[illegible]   100             879-911
[illegible]   101             912-944
Capture       102             945-977
[illegible]   103             978-1010
[illegible]   104             1011-1043
Label         105             1044-1076
Label         106             1077-1109
Capture       107             1110-1142
Label         108             1143-1175
Label         109             1176-1208
Label         110             1209-1241
Label         111             1242-1274
Capture       112             1275-1307
Label         113             1308-1340
Label         114             1341-1373
Label         115             1374-1406
Label         116             1407-1439
Capture       117             1440-1472
Label         118             1473-1505

Nucleic acid sequences which correspond to
nucleotide sequences in the C gene and the 5'UT region
are set forth in sequences 119-145. Table 4 identifies
the sequence with a preferred use.

Table 4

Probe Type    Sequence No.

Capture       119
Label         120
Label         121
Label         122
Capture       123
Label         124
Label         125
Label         126
Capture       127
Label         128
Label         129
Label         130
Capture       131
Label         132
Label         133
Label         134
Label         135
Capture       136
Label         137
Label         138
Label         139
Capture       140
Label         141
Label         142
Label         143
Capture       144
Label         145

The detection and capture probe HCV-specific
segments, and their respective names as used in this
assay were as follows.

Capture sequences are sequences numbered
119-122 and 141-144.

Detection sequences are sequences numbered
119-140.

Each detection sequence contained, in addition to
the sequences substantially complementary to the HCV
sequences, a 5' extension (B) which extension (B) is
complementary to a segment of the second amplifier
nucleic acid. The extension (B) sequence is identified
in the Sequence Listing as Sequence No. 146, and is
reproduced below.

AGGCATAGGACCCGTGTCTT



Each capture sequence contained, in addition to
the sequences substantially complementary to HCV
sequences, a sequence complementary to DNA bound to a
solid phase. The sequence complementary to DNA bound
[illegible] capture sequence. The sequence
complementary to the DNA bound to the support is set
forth as Sequence No. 147 and is reproduced below.

CTTCTTTGGAGAAAGTGGTG
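
The probe layout described above can be sketched as
string concatenation. This is an illustration under
stated assumptions: the target segment shown is a
made-up placeholder rather than an HCV sequence, the
helper names are hypothetical, and the placement of the
support-binding sequence at the 3' end of the capture
probe is assumed, since only the detection probe
extension is stated to be at the 5' end.

```python
EXTENSION_B = "AGGCATAGGACCCGTGTCTT"      # Sequence No. 146
SUPPORT_BINDING = "CTTCTTTGGAGAAAGTGGTG"  # Sequence No. 147

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def detection_probe(target_segment):
    """5' extension (B) followed by the complement of the target segment."""
    return EXTENSION_B + reverse_complement(target_segment)


def capture_probe(target_segment):
    """Complement of the target segment followed by the support-binding
    sequence. The 3' placement is an assumption made for this sketch."""
    return reverse_complement(target_segment) + SUPPORT_BINDING


if __name__ == "__main__":
    placeholder_target = "AAACCC"  # hypothetical, not an HCV sequence
    print(detection_probe(placeholder_target))
    print(capture_probe(placeholder_target))
```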
Microtiter plates were prepared as follows.
Microlite 1 Removawell strips (polystyrene microtiter
plates, 96 wells/plate) were purchased from Dynatech.

Each well was filled with 200 µl 1 N HCl and
incubated at room temperature for [illegible]. The
plates were then washed [illegible] and the wells
aspirated to remove liquid. The wells were then
filled with [illegible] µl 1 N NaOH and incubated at
room temperature for 15-20 min. The plates were again
washed [illegible] times with 1X PBS and the wells
aspirated to remove liquid.

[illegible] was purchased from Sigma Chemicals,
Inc. This polypeptide has a 1:1 molar [illegible]
an average m.w. of 47,900 g/mole. It has [illegible]
of 309 amino acids and contains 155 [illegible].
A [illegible] solution of the polypeptide was mixed
with 2 M NaCl/1X PBS to a final concentration of
0.1 mg/ml (pH 6.0). A volume of 200 µl of this
solution was added to each well. The plate was wrapped
in plastic to prevent drying and incubated at 30°C
overnight. The plate was then washed 4 times with 1X
PBS and the wells aspirated to remove liquid.

The following procedure was used to couple the
nucleic acid, a complementary sequence to Sequence No.
147, to the plates, hereinafter referred to as
immobilized nucleic acid. Synthesis of immobilized
nucleic acid having a sequence complementary to
Sequence No. 133 was described in EPA 883096976. A
quantity of 20 mg disuccinimidyl suberate was dissolved
in 300 µl dimethyl formamide (DMF). A quantity of 26
OD units of immobilized nucleic acid was added to
100 µl coupling buffer (50 mM sodium phosphate, pH
7.8). The coupling mixture was then added to the
DSS-DMF solution and stirred with a magnetic stirrer
for 30 min. An NAP-25 column was equilibrated with 10
mM sodium phosphate, pH 6.5. The coupling mixture
DSS-DMF solution was added to 2 ml 10 mM sodium
phosphate, pH 6.5, at 4°C. The mixture was vortexed to
mix and loaded onto the equilibrated NAP-25 column.
DSS-activated immobilized nucleic acid DNA was eluted
from the column with 3.5 ml 10 mM sodium phosphate, pH
6.5. A quantity of 5.6 OD260 units of eluted
DSS-activated immobilized nucleic acid DNA was added to
1500 ml 50 mM sodium phosphate, pH 7.8. A volume of 50
µl of this solution was added to each well and the
plates were incubated overnight. The plate was then
washed 4 times with 1X PBS and the wells aspirated to
remove liquid.

Final stripping of plates was accomplished as
follows. A volume of 200 µl of 0.2 N NaOH containing
0.5% (w/v) SDS was added to each well. The plate was
wrapped in plastic and incubated at 65°C for 60 min.
The plate was then washed 4 times with 1X PBS and the
wells aspirated to remove liquid. The stripped plate
was stored with desiccant beads at 2-8°C.

Serum samples to be assayed were analyzed using
PCR followed by sequence analysis to determine the
genotype.

Sample preparation consisted of delivering 50 µl
of the serum sample and 150 µl P-K Buffer (2 mg/ml
proteinase K in 53 mM Tris-HCl, pH 8.0/0.6 M NaCl/0.06
M sodium citrate/8 mM EDTA, pH 8.0/1.3% SDS/16 µg/ml
sonicated salmon sperm DNA/7% formamide/50 fmoles
capture probes/160 fmoles detection probes) to each
well. Plates were agitated to mix the contents in the
well, covered and incubated for 16 hr at 62°C.

After a further 10 minute period at room
temperature, the contents of each well were aspirated
to remove all fluid, and the wells washed 2X with
washing buffer (0.1% SDS/0.015 M NaCl/0.0015 M sodium
citrate). The amplifier nucleic acid was then added to
each well (50 µl of 0.7 fmole/µl solution in 0.48
M NaCl/0.048 M sodium citrate/0.1% SDS/0.5% "blocking
reagent" (Boehringer Mannheim, catalog No. 1096 176)).
After covering the plates and agitating to mix the
contents in the wells, the plates were incubated for 30
min. at 52°C.

After a further 10 min period at room temperature, 
the wells were washed as described above. 

Alkaline phosphatase label nucleic acid, disclosed
in EP 883096976, was then added to each well (50
µl/well of 2.66 fmoles/µl). After incubation at
52°C for 15 min., and 10 min. at room temperature, the
wells were washed twice as above and then 3X with 0.015
M NaCl/0.0015 M sodium citrate.

An enzyme-triggered dioxetane (Schaap et al., Tet.
Lett. (1987) 28:1159-1162 and EPA Pub. No. 0254051),
obtained from Lumigen, Inc., was employed. A quantity
of 50 µl Lumiphos 530 (Lumigen) was added to each
well. The wells were tapped lightly so that the
reagent would fall to the bottom and gently swirled to
distribute the reagent evenly over the bottom. The
wells were covered and incubated at 37°C for 20-40 min.

Plates were then read on a Dynatech ML 1000 
luminometer. Output was given as the full integral of 
the light produced during the reaction. 
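
The "full integral of the light produced" can be
approximated from discrete readings by the trapezoidal
rule. The sketch below is an assumption about how such
an integral might be computed from time-stamped
intensity readings; it does not describe the internal
method of the luminometer, and the function name and
sample values are hypothetical.

```python
def integrated_light(times, intensities):
    """Trapezoidal approximation of the integral of intensity over time.

    `times` must be in increasing order and the same length as
    `intensities`. Units are whatever the inputs use (for example,
    relative light units multiplied by seconds).
    """
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    total = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        if width < 0:
            raise ValueError("times must be in increasing order")
        total += width * (intensities[i] + intensities[i - 1]) / 2.0
    return total


if __name__ == "__main__":
    # Hypothetical readings: a signal that rises and then decays.
    print(integrated_light([0, 1, 2], [0, 2, 0]))
```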

The assay positively detected each of the serum
samples, regardless of genotype.

IV. [illegible] Defining Genotypes

The polypeptides encoded by a [illegible]
sequences 1-66 are expressed as a fusion polypeptide
[illegible] carrying such [illegible]
pSODcf1 (Steimer et al. [illegible]).

First, DNA isolated from pSODcf1 is treated with
[illegible]. The following linker was ligated
into the linear DNA [illegible]:

5' GAT CCT GGA ATT CTG ATA AGA
CCT TAA GAC TAT TTT AA 3'

After cloning, the plasmid containing the insert is
[illegible].

Plasmid containing the insert is restricted with
[illegible] HCV [illegible].

Polypeptides are isolated on gels.

V. [illegible]

[illegible] formed in Section [illegible]
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temperature. The pins are removed, vashed in DMF for 5 
minutes, then washed in methanol four times (2 
min/wash) . The pins are allowed to air dry for at 
least 10 minutes, then washed a final time m DMF 
(5Min). l-Hydrox^enzotriazole (HOBt, 367 mg) is 
dissolved in DMF (80 »xL) for use in coupling 
Itaoc-protected polypeptides prepared in Section IV. 

The protected amino acids are placed m 
micro-titer plate wells with HOBt, and the pin block 
placed over the plate, immersing the pins in the 
wells. The assembly is then sealed in a plastic bag 
and allowed to react at 25-C for 18 hours to couple the 
first amino acids to the pins. The block is then 
removed, and the pins washed with D«F (2 min.), MeOH 
(4 X, 2 min.), and again with DMF (2 min.) to clean and 
deprotect the bound amino acids. The procedure is 
repeated for each additional amino acid coupled, until 
all octamers are prepared. 

The free N-termini are then acetylated to 
compensate for the free amide, as most of the epitopes 
are not found at the N-terminus and thus would not have 
the associated positive charge. Acetylation is 
accomplished by filling the wells of a microtiter plate 
with DMF/acetic anhydride/triethylamine (5:2:1 v/v/v)
and allowing the pins to react in the wells for 90
minutes at 20°C. The pins are then washed with DMF (2
min.) and MeOH (4 x, 2 min.), and air dried for at
least 10 minutes.
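
The 5:2:1 (v/v/v) acetylation mixture can be scaled to any batch size by simple proportion. The sketch below, in Python, assumes a hypothetical total volume of 16 mL; the function name `split_by_ratio` is illustrative and not part of the described procedure.

```python
from fractions import Fraction


def split_by_ratio(total_volume, ratio=(5, 2, 1)):
    """Split a total volume into components according to a v/v/v ratio."""
    parts = sum(ratio)
    return [Fraction(total_volume) * r / parts for r in ratio]


if __name__ == "__main__":
    names = ("DMF", "acetic anhydride", "triethylamine")
    # 16 mL is a hypothetical batch size chosen for illustration.
    for name, volume in zip(names, split_by_ratio(16)):
        print(f"{name}: {float(volume):.2f} mL")
```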

The side chain protecting groups are removed by
treating the pins with trifluoroacetic acid/phenol/
dithioethane (95:2.5:2.5, v/v/v) in polypropylene bags
for 4 hours at room temperature. The pins are then
washed in dichloromethane (2 x, 2 min.), 5%
di-isopropylethylamine/dichloromethane (2 x, 5 min.),
dichloromethane (5 min.), and air-dried for at least 10
minutes. The pins are then washed in water (2 min.),
MeOH (18 hours), dried in vacuo, and stored in sealed
plastic bags over silica gel.

lv.B.».b SsiS^

Octamer-bearing pins are treated by sonicating for
30 minutes in a disruption buffer (1% sodium
dodecylsulfate, 0.1% 2-mercaptoethanol, 0.1 M NaH2PO4)
at 60°C. The pins are then immersed several times in
water (60°C), followed by boiling MeOH (2 min.), and
allowed to air dry.

The pins are then precoated for 1 hour at 25°C in
microtiter wells containing 200 µL blocking buffer
(1% ovalbumin, 1% BSA, 0.1% Tween, and 0.05% NaN3 in
PBS), with agitation. The pins are then immersed in
microtiter wells containing 175 µL antisera obtained
from human patients diagnosed as having HCV and allowed
to incubate at 4°C overnight. The formation of a
complex between polyclonal antibodies of the serum and
the polypeptide indicates that the peptides give rise
to an immune response in vivo. Such peptides are
candidates for the development of vaccines.

Thus, this invention has been described and 
illustrated. It will be apparent to those skilled in 
the art that many variations and modifications can be 
made without departing from the purview of the appended 
claims and without departing from the teaching and 
scope of the present invention. 



PCT/US92/04036

WO 92/19743

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: Tai-An Cha 

(ii) TITLE OF INVENTION: HCV GENOMIC SEQUENCES
FOR DIAGNOSTICS AND THERAPEUTICS

(iii) NUMBER OF SEQUENCES: U7 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Wolf, Greenfield & Sacks, P.C.

(B) STREET: 600 Atlantic Avenue

(C) CITY: Boston

(D) STATE: Massachusetts 

(E) COUNTRY: USA 

(F) ZIP: 02210 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette, 5.25 inch

(B) COMPUTER: IBM compatible 

(C) OPERATING SYSTEM: MS-DOS Version 3.3 

(D) SOFTWARE: WordPerfect 5.1 
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(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: Not Available 

(B) FILING DATE: Not Available 

(C) CLASSIFICATION: Not Available 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 07/697,326

(B) FILING DATE: 8 May 1991 
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(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Janiuk, Anthony J. 

(B) REGISTRATION NUMBER: 29,S09 

(C) REFERENCE/DOCKET NUMBER: C0772/7000 
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(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617) 720-3500 

(B) TELEFAX: (617)720-2441 

(C) TELEX: EZEKIEL 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5i21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2

CTCCACAGTC ACTGAGAGCG ACATCCGTAC GGAGGAGGCA 40
ATTTACCAAT GTTGTGACCT GGACCCCCAA GCCCGCATGG 80
CCATCAAGTC CCTCACTGAG AGGCTTTATG TCGGGGGCCC 120
TCTTACCAAT TCAAGGGGGG AGAACTGCGG CTACCGCAGG 160
TGCCGCGCGA GCGGCGTACT GACAACTAGC TGTGGTAACA 200
CCCTCACTTG CTACATCAAG GCCCGGGCAG CCTGTCGAGC 240
CGCAGGGCTC CAGGACTGCA CCATGCTTGT GTGTGGCGAC 280
GACTTAGTCG TTATCTGTGA AAGTGCGGGG GTCCAGGAGG 320
ACGCGGCGAG CCTGAGAGCC 340

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5pt1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3
CTCCACAGTC ACTGAGAGCG ACATCCGTAC GGAGGAGGCA 40
ATCTACCAAT GTTGTGATCT GGACCCCCAA GCCCGCGTGG 80
CCATCAAGTC CCTCACTGAG AGGCTTTACG TTGGGGGCCC 120
TCTTACCAAT TCAAGGGGGG AGAACTGCGG CTACCGCAGG 160
TGCCGGGCGA GCGGCGTACT GACAACTAGC TGTGGTAATA 200
CCCTCACTTG CTACATCAAG GCCCGGGCAG CCTGTCGAGC 240
CGCAGGGCTC CGGGACTGCA CCATGCTCGT GTGTGGTGAC 280
GACTTGGTCG TTATCTGTGA GAGTGCGGGG GTCCAGGAGG 320
ACGCGGCGAG CCTGAGAGCC 340

(2) INFORMATION FOR SEQ ID NO: 4

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5gm2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4
CTCTACAGTC ACTGAGAACG ACATCCGTAC GGAGGAGGCA 40
ATTTACCAAT GTTGTGACCT GGACCCCCAA GCCCGCGTGG 80
CCATCAAGTC CCTCACTGAG AGGCTTTATG TTGGGGGCCC 120
• c^^T TCAAGGGGGG AAAACTGCGG CTATCGCAGG 160
CCCTCACTTG CTACATTAAG GCCCGGGCAG CCTGTCGAGC 240
cg J^c CAGGACTGCA CCATGCTCGT GTGTGGCGAC 280
SStagtcg TTATCTGTGA GAGTGCGGGA GTCCAGGAGG 320
ACGCGGCGAA CTTGAGAGCC 340
(2) INFORMATION FOR SEQ ID NO: 5 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5us17

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5
CTCCACAGTC ACTGAGAGCG ATATCCGTAC GGAGGAGGCA 40
ATCTACCAGT GTTGTGACCT GGACCCCCAA GCCCGCGTGG 80
CCATCAAGTC CCTCACCGAG AGGCTTTATG TCGGGGGCCC 120
" I^RCCAAT TCAAGGGGGG MAACIGCOS mTCG«G(3 160
l^OC^ GC8SC0TACT GACAACTAGC TGIGGTAACA 200
CCCTCACTTG TTACATCAAG GCCCAAGCAG CCTGTCGAGC 240
CGCAGGGCTC CGGGACTGCA CCATGCTCGT GTGTGGCGAC 280
GACTTAGTCG TTATCTGTGA AAGTCAGGGA GTCCAGGAGG 320
ATGCAGCGAA CCTGAGAGCC 340

(2) INFORMATION FOR SEQ ID NO: 6

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5sp2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6
ctCTACAG^'aCTGAGAGCG ATATCCGTAC GGAGGAGGCA 40
ATCTACCAAT GTTGTGACCT GGACCCCGAA GCCCGTGTGG 80
CCATCAAGTC CCTCACTGAG AGGCTTTATG TTGGGGGCCC 120
TCTTACCAAT TCAAGGGGGG AGAACTGCGG CTACCGCAGG 160
TGCCGCGCAA GCGGCGTACT GACGACTAGC TGTGGTAATA 200
CCCTCACTTG TTACATCAAG GCCCGGGCAG CCTGTCGAGC 240
CGCAGGGCTC CAGGACTGCA CCATGCTCGT GTGTGGCGAC 280
GACCTAGTCG TTATCTGCGA AAGTGCGGGG GTCCAGGAGG 320
ACGCGGCGAG CCTGAGAGCC 340
(2) INFORMATION FOR SEQ ID NO: 7

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5j1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7
CTCCACAGTC ACTGAGAATG ACACCCGTGT TGAGGAGTCA 40
ATTTACCAAT GTTGTGACTT GGCCCCCGAA GCCAGACAGG 80
CCATAAGGTC GCTCACAGAG CGGCTCTATG TCGGGGGTCC 120
TATGACTAAC TCCAAAGGGC AGAACTGCGG CTATCGCCGG 160
TGCCGCGCGA GCGGCGTGCT GACGACTAGC TGCGGTAATA 200
CCCTCACATG CTACCTGAAG GCCACAGCGG CCTGTCGAGC 240
TGCCAAGCTC CAGGACTGCA CGATGCTCGT GAACGGAGAC 280
GACCTTGTCG TTATCTGTGA AAGCGCGGGG AACCAAGAGG 320
ACGCGGCAAG CCTACGAGCC 340

(2) INFORMATION FOR SEQ ID NO: 8

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE:

MTTfcCCm <»'"''^!t! rsocinACa TCSSOOGCCC 
CCM«6GIC 8CI«««^ clAKGCCGA 

-^^-^ 

j„ CCCiaOIG StCSI 

GACCTTOTCS ITMCIOIGR WKJCGCGSGR « 
MOCGGCGRO CCTACGAGIC 



40 
80 
120 
160 
200 
240 
280 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5k1.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9
CTCAACGGTC ACCGAGAATG ACATCCGTGT TGAGGAGTCA 40
ATTTATCAAT GTTGTGCCTT GGCCCCCGAG GCTAGACAGG 80
CCATAAGGTC GCTCACAGAG CGGCTTTATA TCGGGGGCCC 120
CCTGACCAAT TCAAAGGGGC AGAACTGCGG TTATCGCCGG 160
TGCCGCGCCA GCGGCGTACT GACGACCAGC TGCGGTAATA 200
CCCTTACATG TTACTTGAAG GCCTCTGCAG CCTGTCGAGC 240
CGCGAAGCTC CAGGACTGCA CGATGCTCGT GTGTGGGGAC 280
GACCTTGTCG TTATCTGTGA AAGCGCGGGA ACCCAGGAGG 320
ACGCGGCGAA CCTACGAGTC 340

(2) INFORMATION FOR SEQ ID NO: 10

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5gh6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10
CTCAACGGTC ACTGAGAGTG ACATCCGTGT CGAGGAGTCG 40
ATTTACCAAT GTTGTGACTT GGCCCCCGAA GCCAGGCAGG 80
CCATAAGGTC GCTCACCGAG CGACTTTATA TCGGGGGCCC 120
CCTGACTAAT TCAAAAGGGC AGAACTGCGG TTATCGCCGG 160
TGCCGCGCGA GCGGCGTGCT GACGACTAGC TGCGGTAATA 200
CCCTCACATG TTACTTGAAG GCCTCTGCAG CCTGTCGAGC 240
TGCAAAGCTC CAGGACTGCA CGATGCTCGT GAACGGGGAC 280
GACCTTGTCG TTATCTGCGA GAGCGCGGGA ACCCAAGAGG 320
ACGCGGCGAG CCTACGAGTC 340

(2) INFORMATION FOR SEQ ID NO: 11

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5sp1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11
CTCCACAGTC ACTGAGAGTG ACATCCGTGT TGAGGAGTCA 40
ATTTACCAAT GTTGTGACTT GGCCCCCGAA GCCAGACAGG 80
CTATAAGGTC GCTCACAGAG CGGCTGTACA TCGGGGGTCC 120
CCTGACTAAT TCAAAAGGGC AGAACTGCGG CTATCGCCGG 160
TGCCGCGCAA GCGGCGTGCT GACGACTAGC TGCGGTAACA 200
CCCTCACATG TTACTTGAAG GCCTCTGCGG CCTGTCGAGC 240
TGCGAAGCTC CAGGACTGCA CGATGCTCGT GTGCGGTGAC 280
GACCTTGTCG TTATCTGTGA GAGCGCGGGA ACCCAAGAGG 320
ACGCGGCGAG CCTACGAGTC 340

(2) INFORMATION FOR SEQ ID NO: 12

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5sp3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12
CTCAACAGTC ACTGAGAGTG ACATCCGTGT TGAGGAGTCA 40
ATCTACCAAT GTTGTGACT GGCCCCCGAA GCCAGACAGG 80
CTATAAGGTC GCTCACAGAG CGGCTTTACA TCGGGGGTCC 120
CCTGACTAAT TCAAAAGGGC AGAACTGCGG CTATCGCCGG 160
TGCCGCGCAA GCGGCGTGCT GACGACTAGC TGCGGTAATA 200
CCCTCACATG TTACCTGAAG GCCAGTGCGG CCTGTCGAGC 240
TGCGAAGCTC CAGGACTGCA CAATGCTCGT GTGCGGTGAC 280
GACCTTGTCG TTATCTGTGA GAGCGCGGGG ACCCAAGAGG 320
ACGCGGCGAG CCTACGAGTC 340



(2) INFORMATION FOR SEQ ID NO: 13 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: ns5k2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13
CTCAACCGTC ACTGAGAGAG ACATCAGAAC TGAGGAGTCC 40
ATATACCGAG CCTGCTCCCT GCCTGAGGAG GCTCACATTG 80
CCATACACTC GCTGACTGAG AGGCTCTACG TGGGAGGGCC 120
CATGTTCAAC AGCAAGGGCC AGACCTGCGG GTACAGGCGT 160
TGCCGCGCCA GCGGGGTGCT CACCACTAGC ATGGGGAACA 200
CCATCACATG CTATGTAAAA GCCCTAGCGG CTTGCAAGGC 240
TGCAGGGATA GTTGCACCCT CAATGCTGGT ATGCGGCGAC 280
GACTTAGTTG TCATCTCAGA AAGCCAGGGG ACTGAGGAGG 320
ACGAGCGGAA CCTGAGAGCT 340

(2) INFORMATION FOR SEQ ID NO: 14

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5arg8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14
CTCTACAGTC ACGTAAAAGG ACATCACATC CTAGGAGTCC 40
ATCTACCAGT CCTGTTCACT GCCCGAGGAG GCTCGAACTG 80
CTATACACTC ACTGACTGAG AGACTATACG TAGGGGGGCC 120
CATGACAAAC AGCAAGGGCC AATCCTGCGG GTACAGGCGT 160
^cSa GCGCAGTGCT CACCACCAGC ATGGGCAACA 200
C^SScGTG CTACGTAAAA GCCAGGGCGG CGTGTAACGC 240
™gatt GTTGCTCCCA CCATGCTGGT GTGCGGTGAC 280
TCATCTCAGA GAGTCAAGGG GCTGAGGAGG 320
ACGAGCAGAA CCTGAGAGTC 340

(2) INFORMATION FOR SEQ ID NO: 15

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5i10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15
CTOJACAGTC ACAGAGAGGG ACATCAGAAC CGAGGAGTCC 40
atSatctgt CCTGCTCACT GCCTGAGGAG GCCCGAACTG 80
ctSacactc ACTGACTGAG AGACTGTACG TAGGGGGGCC 120
StGACAAAC AGCAAGGGGC AATCCTGCGG GTACAGGCGT 160
CT»CGT«« GCCAGGGCGG CGTGI.ACGC 240
CGCGGGCATT GTTGCTCCCA CCATGTTGGT GTGCGGCGAC 280
GACCTGGTTG TCATCTCAGA GAGTCAGGGG GTCGAGGAAG 320



(2) INFORMATION FOR SEQ ID NO: 16

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5arg6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16
CTCTACAGTC ACGGAGAGGG ACATCAGAAC CGAGGAGTCC 40
ATCTATCTGT CCTGTTCACT GCCTGAGGAG GCTCGAACTG 80
CCATACACTC ACTGACTGAG AGGCTGTACG TAGGGGGGCC 120
CATGACAAAC AGCAAAGGGC AATCCTGCGG GTACAGGCGT 160
TGCCGCGCGA GCGGAGTGCT CACCACCAGC ATGGGTAACA 200
CACTCACGTG CTACGTGAAA GCTAAAGCGG CATGTAACGC 240
CGCGGGCATT GTTGCCCCCA CCATGTTGGT GTGCGGCGAC 280
GACCTAGTCG TCATCTCAGA GAGTCAAGGG GTCGAGGAGG 320
ATGAGCGAAA CCTGAGAGCT 340



ATGAGCGGAA CCTGAGAGTC
(2) INFORMATION FOR SEQ ID NO: 17

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: nsSMb

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17
CTCAACCGTC ACGGAGAGGG ACATAAGAAC AGAAGAATCC 40
MATATCAGG GTTGTTCCCT GCCTCAGGAG GCTAGAACTG 80
CTATCCACTC GCTCACTGAG AGACTCTACG TAGGAGGGCC 120
CATGACAAAC AGCAAGGGAC AATCCTGCGG TTACAGGCGT 160
TGCCGCGCCA GCGGGGTCTT CACCACCAGC ATGGGGAATA 200
CCATGACATG CTACATCAAA GCCCTTGCAG CGTGCAAAGC 240
TGCAGGGATC GTGGACCCTA TCATGCTGGT GTGTGGAGAC 280
GACCTGGTCG TCATCTCGGA GAGCGAAGGT AACGAGGAGG 320
ACGAGCGAAA CCTGAGAGCT 340



(2) INFORMATION FOR SEQ ID NO: 18 

(i) SEQUENCE CHARACTERISTICS: 
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(A) LENGTH: 340 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5sa283

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18
CTCGACCGTT ACCGAACATG ACATAATGAC TGAAGAGTCT 40
ATTTACCAAT CATTGTACTT GCAGCCTGAG GCGCGTGTGG 80
CAATACGGTC ACTCACCCAA CGCCTGTACT GTGGAGGCCC 120
CATGTATAAC AGCAAGGGGC AACAATGTGG TTATCGTAGA 160
TGCCGCGCCA GCGGCGTCTT CACCACTAGT ATGGGCAACA 200
CCATGACGTG CTACATTAAG GCTTTAGCCT CCTGTAGAGC 240
CGCAAAGCTC CAGGACTGCA CGCTCCTGGT GTGTGGTGAT 280
GATAAAGCGA CCTGAGAGCC 340






(2) INFORMATION FOR SEQ ID NO: 19 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 nucleotides 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5sa156



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19 

CTCGACCGTT ACCGAACATG ACATAATGAC TGAAGAGTCC 40
ATTTACCAAT CATTGTACTT GCAGCCTGAG GCACGCGCGG 80
CAATACGGTC ACTCACCCAA CGCCTGTACT GTGGAGGCCC 120
CATGTATAAC AGCAAGGGGC AACAATGTGG TTACCGTAGA 160
TGCCGCGCCA GCGGCGTCTT CACCACCAGT ATGGGCAACA 200
CCATGACGTG CTACATCAAG GCTTCAGCCG CCTGTAGAGC 240
TGCAAAGCTC CAGGACTGCA CGCTCCTGGT GTGTGGTGTG 280
ACCTTGGTGG CCATTTGCGA GAGCCAAGGG ACGCACGAGG 320
ATGAAGCGTG CCTGAGAGTC 340

(2) INFORMATION FOR SEQ ID NO: 20

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5i11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20
CTCTACTGTC ACTGAACAGG ACATCAGGGT GGAAGAGGAG 40
ATATACCAGT GCTGTAACCT TGAACCGGAG GCCAGGAAAG 80
TGATCTCCTC CCTCACGGAG CGGCTTTACT GCGGGGGCCC 120
TATGTTCAAC AGCAAGGGGG CCCAGTGTGG TTATCGCCGT 160
TGCCGTGCTA GTGGAGTCCT GCCTACCAGC TTCGGCAACA 200
CAATCACTTG TTACATCAAG GCTAGAGCGG CTTCGAAGGC 240
CGCAGGCCTC CGGAACCCGG ACTTTCTTGT CTGCGGAGAT 280
GATCTGGTCG TGGTGGCTGA GAGTGATGGC GTCGACGAGG 320
ATAGAGCAGC CCTGAGAGCC 340

(2) INFORMATION FOR SEQ ID NO: 21 



15 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5i4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21
CTCGACTGTC ACTGAACAGG ACATCAGGGT GGAAGAGGAG 40
ATATACCAAT GCTGTAACCT TGAACCGGAG GCCAGGAAAG 80
TGATCTCCTC CCTCACGGAG CGGCTTTACT GCGGGGGCCC 120
TATGTTCAAT AGCAAGGGGG CCCAGTGTGG TTATCGCCGT 160
TGCCGTGCTA GTGGAGTTCT GCCTACCAGC TTCGGCAACA 200
CAATCACTTG TTACATCAAG GCTAGAGCGG CTGCGAAGGC 240
CGCAGGGCTC CGGACCCCGG ACTTTCTCGT CTGCGGAGAT 280
GATCTGGTTG TGGTGGCTGA GAGTGATGGC GTCGACGAGG 320
ATAGAACAGC CCTGCGAGCC 340

(2) INFORMATION FOR SEQ ID NO: 22 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ns5gh8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22
CTCAACTGTC ACTGAACAGG ACATCAGGGT GGAAGAGGAG 40
ATATACCAAT GCTGTAACCT TGAACCGGAG GCCAGGAAAG 80
TGATCTCCTC CCTCACGGAA CGGCTTTACT GCGGGGGCCC 120
i;^CWC ASOAGGGGB CCOGIGIGS WM=SCCOT 160
8TGGAGTTCI SCCTMCAGC WCGGCMCA 200
!^»^TC CGGAACCCGG ACITTCTTGI CTGCGGRGM 280
gS^O 'gGTGOCTG. GTCA.T»G3 320
ATAGAGCASC CCTGGGAGCC 340

(2) INFORMATION FOR SEQ ID NO: 23

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: (ATCC # 40394)
(C) INDIVIDUAL ISOLATE: hcv1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23
Iacggcgttg GTAATGGCTC AGCTGCTCCG GATCCCACAA 40
Sttgg ACATGATCGC TGGTGCTCAC TGGGGAGTCC 80
TGGCGGGCAT AGCGTATTTC 100

(2) INFORMATION FOR SEQ ID NO: 24

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: US5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24
GACGGCGTTG GTGGTAGCTC AGGTACTCCG GATCCCACAA 40
GCCATCATGG ACATGATCGC TGGAGCCCAC TGGGGAGTCC 80
TGGCGGGCAT AGCGTATTTC 100
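
Each entry in the listing states a length and then gives the sequence in groups of ten nucleotides with a running position counter at the end of each line. A consistency check of an entry against its stated length can be sketched in Python as follows. The function names `parse_listing` and `check_entry` are hypothetical, and the sketch assumes that any token made up only of A, C, G and T is sequence data while all-digit tokens are margin numbers or position counters.

```python
import re

BASES = re.compile(r"[ACGT]+")


def parse_listing(lines):
    """Join the nucleotide groups of a listing entry, skipping numeric tokens.

    Margin line numbers and running position counters are all-digit tokens,
    so only tokens consisting solely of A, C, G and T are kept.
    """
    groups = [tok for line in lines for tok in line.split()
              if BASES.fullmatch(tok)]
    return "".join(groups)


def check_entry(lines, declared_length):
    """Return True if the parsed sequence has the declared length."""
    return len(parse_listing(lines)) == declared_length


if __name__ == "__main__":
    # SEQ ID NO: 24 (isolate US5) as transcribed for this illustration.
    seq24 = [
        "GACGGCGTTG GTGGTAGCTC AGGTACTCCG GATCCCACAA 40",
        "GCCATCATGG ACATGATCGC TGGAGCCCAC TGGGGAGTCC 80",
        "15 TGGCGGGCAT AGCGTATTTC 100",
    ]
    print(check_entry(seq24, 100))
```

A failed check points to a dropped or damaged group in the entry rather than to any property of the isolate itself.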

(2) INFORMATION FOR SEQ ID NO: 25 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: AUS5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25
AACGGCGCTG GTAGTAGCTC AGCTGCTCAG GGTCCCGCAA 40
GCCATCGTGG ACATGATCGC TGGTGCCCAC TGGGGAGTCC 80
TAGCGGGCAT AGCGTATTTT 100

(2) INFORMATION FOR SEQ ID NO: 26 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: US4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26
GACAGCCCTA GTGGTATCGC AGTTACTCCG GATCCCACAA 40
GCCGTCATGG ATATGGTGGC GGGGGCCCAC TGGGGAGTCC 80
TGGCGGGCCT TGCCTACTAT 100

(2) INFORMATION FOR SEQ ID NO: 27

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ARG2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27
AGCAGCCCTA GTGGTGTCGC AGTTACTCCG GATCCCACAA 40
AGCATCGTGG ACATGGTGGC GGGGGCCCAC TGGGGAGTCC 80
TGGCGGGCCT TGCTTACTAT 100



(2) INFORMATION FOR SEQ ID NO: 28 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: I15

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28
GOCAGCCCTA GTGGTGTCGC AGTTACTCCG GATCCCGCAA 40
GCTGTCGTGG ACATGGTGGC GGGGGCCCAC TGGGGAATCC 80
TAGCGGGTCT TGCCTACTAT 100

(2) INFORMATION FOR SEQ ID NO: 29 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: GH8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29
TGTGGGTATG GTGGTGGCGC ACGTCCTGCG TTTGCCCCAG 40
ACCTTGTTCG ACATAATAGC CGGGGCCCAT TGGGGCATCT 80
TGGCGGGCTT GGCCTATTAC 100

(2) INFORMATION FOR SEQ ID NO: 30 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: I4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30
TGTGGGTATG GTGGTAGCAC ACGTCCTGCG TCTGCCCCAG 40
ACCTTGTTCG ACATAATAGC CGGGGCCCAT TGGGGCATCT 80
TGGCAGGCCT AGCCTATTAC 100

(2) INFORMATION FOR SEQ ID NO: 31

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: I11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31
TGTGGGTATG GTGGTGGCGC AAGTCCTGCG TTTGCCCCAG 40
ACCTTGTTCG ACGTGCTAGC CGGGGCCCAT TGGGGCATCT 80
TGGCGGGCCT GGCCTATTAC 100



(2) INFORMATION FOR SEQ ID NO: 32 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: I10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32
TACCACTATG CTCCTGGCAT ACTTGGTGCG CATCCCGGAG 40
GTCATCCTGG ACATTATCAC GGGAGGACAC TGGGGCGTGA 80
TGTTTGGCCT GGCTTATTTC 100

(2) INFORMATION FOR SEQ ID NO: 33

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: (ATCC # 40394)
(C) INDIVIDUAL ISOLATE: hcv1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCAAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252

(2) INFORMATION FOR SEQ ID NO: 34

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: us5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCAAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252

(2) INFORMATION FOR SEQ ID NO: 35

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: aus1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCACGCCCC CGCAAGATCA 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252
(2) INFORMATION FOR SEQ ID NO: 36

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: sp2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATAAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252

(2) INFORMATION FOR SEQ ID NO: 37

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: gm2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCAAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252

(2) INFORMATION FOR SEQ ID NO: 38

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: i21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATAAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCAAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252



(2) INFORMATION FOR SEQ ID NO: 39

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: us4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252

(2) INFORMATION FOR SEQ ID NO: 40

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: jh1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA TC 252

(2) INFORMATION FOR SEQ ID NO: 41

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: nac5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252

(2) INFORMATION FOR SEQ ID NO: 42

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: arg2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252



(2) INFORMATION FOR SEQ ID NO: 43

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: sp1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252

(2) INFORMATION FOR SEQ ID NO: 44 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: gh1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252

(2) INFORMATION FOR SEQ ID NO: 45

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: i15

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45
GTTAGTATGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCA GGACGACCGG GTCCTTTCTT GGATCAACCC 120
GCTCAATGCC TGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA CC 252
(2) INFORMATION FOR SEQ ID NO: 46

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ilO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46

GCTAGTATCA GTGTCGTACA GCCTCCAGGC CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCG GGAAGACTGG GTCCTTTCTT GGATAAACCC 120
ACTCTATGCC CGGCCATTTG GGCGTGCCCC CGCAAGACTG 160
CTAGCCGAGT AGCGTTGGGT TGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA TC 252

(2) INFORMATION FOR SEQ ID NO: 47 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: arg6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47

GTTAGTATGA GTCTCGTACA GCCTCCAGGC CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCTG GGAAGACTGG GTCCTTTCTT GGATAAACCC 120
ACTCTATGCC CAGCCATTTG GGCGTGCCCC CGCAAGACTG 160
CTAGCCGAGT AGCGTTGGGT TGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA TC 252

(2) INFORMATION FOR SEQ ID NO: 48

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: s21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48

GTTAGTACGA GTGTCGTGCA GCCTCCAGGA CTCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATCGCTG GGGTGACCGG GTCCTTTCTT GGAGCAACCC 120
GCTCAATACC CAGAAATTTG GGCGTGCCCC CGCGAGATCA 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA AC 252

(2) INFORMATION FOR SEQ ID NO: 49 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: gj61329

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49

GTTAGTACGA GTGTCGTGCA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATCGCTG GGGTGACCGG GTCCTTTCTT GGAGTAACCC 120
GCTCAATACC CAGAAATTTG GGCGTGCCCC CGCGAGATCA 160
CTAGCCGAGT AGTGTTGGGT CGCGAAAGGC CTTGTGGTAC 200
TGCCTGATAG GGTGCTTGCG AGTGCCCCGG GAGGTCTCGT 240
AGACCGTGCA AC 252

(2) INFORMATION FOR SEQ ID NO: 50

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 180 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: sa3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50

GTTAGTATGA GTGTCGAACA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCG GGATGACCGG GTCCTTTCTT GGATAAACCC 120
GCTCAATGCC CGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT 180

(2) INFORMATION FOR SEQ ID NO: 51 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 180 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: Ba4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51

GTTAGTATGA GTGTCGAACA GCCTCCAGGA CCCCCCCTCC 40
CGGGAGAGCC ATAGTGGTCT GCGGAACCGG TGAGTACACC 80
GGAATTGCCG GGATGACCGG GTCCTTTCTT GGATAAACCC 120
GCTCAATGCC CGGAGATTTG GGCGTGCCCC CGCGAGACTG 160
CTAGCCGAGT AGTGTTGGGT 180

(2) INFORMATION FOR SEQ ID NO: 52 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 549 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: (ATCC # 40394)

(C) INDIVIDUAL ISOLATE: hcvl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52

ATGAGCACGA ATCCTAAACC TCAAAAAAAA AACAAACGTA 40
ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGTGG 80
CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120
GGCCCTAGAT TGGGTGTGCG CGCGACGAGA AAGACTTCCG 160
AGCGGTCGCA ACCTCGAGGT AGACGTCAGC CTATCCCCAA 200
GGCTCGTCGG CCCGAGGGCA GGACCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAATGAGGGC TGCGGGTGGG 280
CGGGATGGCT CCTGTCTCCC CGTGGCTCTC GGCCTAGCTG 320
GGGCCCCACA GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360
AAGGTCATCG ATACCCTTAC GTGCGGCTTC GCCGACCTCA 400
TGGGGTACAT ACCGCTCGTC GGCGCCCCTC TTGGAGGCGC 440
TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAAGAC 480
GGCGTGAACT ATGCAACAGG GAACCTTCCT GGTTGCTCTT 520
TCTCTATCTT CCTTCTGGCC CTGCTCTCT 549

(2) INFORMATION FOR SEQ ID NO: 53 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 549 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: US5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40
ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGTGG 80
CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120
GGCCCTAGAT TGGGTGTGCG CGCGACGAGG AAGACTTCCG 160
AGCGGTCGCA ACCTCGAGGT AGACGTCAGC CTATCCCCAA 200
GGCGCGTCGG CCCGAGGGCA GGACCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAATGAGGGT TGCGGGTGGG 280
CGGGATGGCT CCTGTCTCCC CGTGGCTCTC GGCCTAGTTG 320
GGGCCCCACA GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360
AAGGTCATCG ATACCCTTAC GTGCGGCTTC GCCGACCACA 400
TGGGGTACAT ACCGCTCGTC GGCGCCCCTC TTGGAGGCGC 440
TGCCAGGGCT CTGGCGCATG GCGTCCGGGT TCTGGAAGAC 480
GGCGTGAACT ATGCAACAGG GAACCTTCCT GGTTGCTCTT 520
TCTCTATCTT CCTTCTGGCC CTGCTCTCT 549

(2) INFORMATION FOR SEQ ID NO: 54

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ausl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54

ACCAAACGTA 40
TCCCGGGTGG 80
GCCGCGCAGG 120
AAGACTTCCG 160
CTATCCCTAA 200
TCAGCCCGGG 240
TGCGGATGGG 280
GGCCTAGTTG 320
CAATXXGGGT 360
GCCGACCACA 400
TTGGGGGCGC 440
TCTGGAAGAC 480
GGTTGCTCTT 520
TCTCTATCTT CCTTCTGGCC CTTCTCTCT 549

(2) INFORMATION FOR SEQ ID NO: 55

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 549 nucleotides 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: sp2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40
ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGTGG 80
CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120
GGCCCTAGAT TGGGTGTGCG CACGACGAGG AAGACTTCCG 160
AGCGGTCGCA ACCTCGAGGT AGACGTCAGC CCATCCCCAA 200
GGCTCGTCGA CCCGAGGGCA GGACCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAATGAGGGC TGCGGGTGGG 280
CGGGATGGCT CCTGTCTCCC CGTGGCTCTC GGCCTAGCTG 320
GGGCCCCACA GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360
AAGGTCATCG ATACCCTTAC GTGCGGCTTC GCCGACCTCA 400
TGGGGTACAT ACCGCTCGTC GGCGCCCCTC TTGGAGGCGC 440
TGCCAGAGCC CTGGCGCATG GCGTCCGGGT TCTGGAAGAC 480
GGCGTGAACT ATGCAACAGG GAACCTTCCC GGTTGCTCTT 520
TCTCTATCTT CCTTCTGGCC CTGCTCTCT 549

(2) INFORMATION FOR SEQ ID NO: 56 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 549 nucleotides 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: gin2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56

ATGAGCACGA ATCCTAAACC TCAAAGAAGA ACCAAACGTA 40
ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGTGG 80
CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120
GGCCCTAGAT TGGGTGTGCG CGCGACGAGG AAGACTTCCG 160
AGCGGTCGCA ACCTCGAGGT AGACGTCAGC CTATCCCCAA 200
GGCACGTCGG CCCGAGGGTA GGACCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAATGAGGGT TGCGGGTGGG 280
CGGGATGGCT CCTGTCTCCC CGCGGCTCTC GGCCTAACTG 320
GGGCCCCACA GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360
AAGGTCATCG ATACCCTTAC GTGCGGCTTC GCCGACCTCA 400
TGGGGTACAT ACCGCTCGTC GGCGCCCCTC TTGGAGGCGC 440
TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAAGAC 480
GGCGTGAACT ATGCAACAGG GAACCTTCCT GGTTGCTCTT 520
TCTCTATCTT CCTTCTGGCC CTGCTCTCT 549

(2) INFORMATION FOR SEQ ID NO: 57

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 121

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40
ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGTGG 80
CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120
GGCCCTAGAT TGGGTGTGCG CGCGACGAGG AAGACTTCCG 160
AGCGGTCGCA ACCTCGTGGT AGACGCCAGC CTATCCCCAA 200
GGCGCGTCGG CCCGAGGGCA GGACCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAATGAGGGT TGCGGGTGGG 280
CGGGATGGCT CCTGTCTCCC CGTGGCTCTC GGCCTAGCTG 320
GGGCCCCACA GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360
AAGGTCATCG ATACCCTTAC GTGCGGCTTC GCCGACCTCA 400
TGGGGTACAT ACCGCTCGTC GGCGCCCCTC TTGGAGGCGC 440
TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAAGAC 480
GGCGTGAACT ATGCAACAGG GAACCTTCCT GGTTGCTCTT 520
TTTCTATTTT CCTTCTGGCC CTGCTCTCT 549

(2) INFORMATION FOR SEQ ID NO: 58 



WO 92/19743 PCT/US92/04036



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 549 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: us4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40
ACACCAACCG CCGCCCACAG GACGTTAAGT TCCCGGGCGG 80
TGGCCAGGTC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120
GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160
AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200
GGCTCGCCAG CCCGAGGGCA GGGCCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAATGAGGGT ATGGGGTGGG 280
CAGGATGGCT CCTGTCACCC CGTGGCTCTC GGCCTAGTTG 320
GGGCCCCACG GACCCCCGGC GTAGGTCGCG TAATTTGGGT 360
AAGGTCATCG ATACCCTCAC ATGCGGCTTC GCCGACCTCA 400
TGGGGTACAT TCCGCTCGTC GGCGCCCCCC TTAGGGGCGC 440
TGCCAGGGCC TTGGCGCATG GCGTCCGGGT TCTGGAGGAC 480
GGCGTGAACT ACGCAACAGG GAATCTGCCC GGTTGCTCCT 520
TTTCTATCTT CCTCTTGGCT CTGCTGTCC 549

(2) INFORMATION FOR SEQ ID NO: 59

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: jhl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59

ATGAGCACAA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40
ACACCAACCG CCGCCCACAG GACGTCAAGT TCCCGGGCGG 80
TGGTCAGATC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120
GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160
AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200
GGCTCGCCAG CCCGAGGGCA GGGCCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAACGAGGGT ATGGGGTGGG 280
CAGGATGGCT CCTGTCACCC CGTGGCTCTC GGCCTAGTTG 320
GGGCCCCACG GACCCCCGGC GTAGGTCGCG TAATTTGGGT 360
AAGGTCATCG ATACCCTCAC ATGCGGCTTC GCCGACCTCA 400
TGGGGTACAT TCCGCTTGTC GGCGCCCCCC TAGGGGGCGC 440
TGCCAGGGCC CTGGCACATG GTGTCCGGGT TCTGGAGGAC 480
GGCGTGAACT ATGCAACAGG GAATTTGCCC GGTTGCTCTT 520
TCTCTATCTT CCTCTTGGCT CTGCTGTCC 549

(2) INFORMATION FOR SEQ ID NO: 60

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: nac5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60

ATGAGCACAA ATCCTAAACC CCAAAGAAAA ACCAAACGTA 40
ACACCAACCG TCGCCCACAG GACGTCAAGT TCCCGGGCGG 80
TGGTCAGATC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120
GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160
AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200
GGCTCGCCGG CCCGAGGGCA GGTCCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAACGAGGGT ATGGGGTGGG 280
CAGGATGGCT CCTGTCACCC CGCGGCTCCC GGCCTAGTTG 320
GGGCCCCACG GACCCCCGGC GTAGGTCGCG TAATTTGGGT 360
AAGGTCATCG ATACCCTCAC ATGCGGCTTC GCCGACCTCA 400
TGGGGTACAT TCCGCTCGTC GGCGCCCCCC TAGGGGGCGC 440
TGCCAGGGCC CTGGCACATG GTGTCCGGGT TCTGGAGGAC 480
GGCGTGAACT ATGCAACAGG GAATTTGCCT GGTTGCTCTT 520
TCTCTATCTT CCTCTTGGCT CTGCTGTCC 549

(2) INFORMATION FOR SEQ ID NO: 61

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: arg2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40
ACACCAACCG CCGCCCACAG GACGTCAAGT TCCCGGGCGG 80
TGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120
GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160
AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200
GGCTCGCCAG CCCGAGGGTA GGGCCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAATGAGGGT ATGGGGTGGG 280
CAGGGTGGCT CCTGTCCCCC CGCGGCTCCC GGCCTAGTTG 320

GGCTCGCCGG CCCGAGGGCA GGGCCTGGGC TCAGCCCGGG 240
TACCCTTGGC CCCTCTATGG CAATGAGGGT ATGGGGTGGG 280
CAGGATGGCT CCTGTCACCC CGTGGTTCTC GGCCTAGTTG 320
GGGCCCCACG GACCCCCGGC GTAGGTCGCG CAATTTGGGT 360
AAGATCATCG ATACCCTCAC GTGCGGCTTC GCCGACCTCA 400
TGGGGTACAT TCCGCTCGTC GGCGCCCCCC TAGGGGGCGC 440
TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAGGAC 480
GGCGTGAACT ATGCAACAGG GAATCTGCCC GGTTGCTCCT 520
TTTCTATCTT CCTTCTGGCT TTGCTGTCC 549

(2) INFORMATION FOR SEQ ID NO: 64 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 549 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: il5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA 40
ACACCAACCG CCGCCCACAG GACGTCAAGT TCCCGGGCGG 80
TGGTCAGATC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120
GGCCCCAGGT TGGGTGTGCG CGCGACTAGG AAGACTTCCG 160
AGCGGTCGCA ACCTCGTGGA AGGCGACAAC CTATCCCCAA 200
GGCTCGCCAG CCCGAGGGCA GGGCCTGGGC TCAGCCCGGG 240
TACCCCTGGC CCCTCTATGG CAATGAGGGT ATGGGGTGGG 280
CAGGATGGCT CCTGTCACCC CGCGGCTCCC GGCCTAGTTG 320
GGGCCCCAAA GACCCCCGGC GTAGGTCGCG TAATTTGGGT 360
AAGGTCATCG ATACCCTCAC ATGCGGCTTC GCCGACCTCA 400
TGGGGTACAT TCCGCTCGTC GGCGCCCCCT TAGGGGGCGC 440
TGCCAGGGCC CTGGCGCATG GCGTCCGGGT TCTGGAGGAC 480
GGCGTGAACT ATGCAACAGG GAATCTACCC GGTTGCTCTT 520
TCTCTATCTT CCTCTTGGCT TTGCTGTCC 549

(2) INFORMATION FOR SEQ ID NO: 65

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: ilO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65

ATGAGCACAA ATCCTAAACC TCAAAGAAAA ACCAAAAGAA 40
ACACTAACCG CCGCCCACAG GACGTCAAGT TCCCGGGCGG 80
TGGCCAGATC GTTGGCGGAG TATACTTGCT GCCGCGCAGG 120
GGCCCGAGAT TGGGTGTGCG CGCGACGAGG AAAACTTCCG 160
AACGATCCCA GCCACGCGGA AGGCGTCAGC CCATCCCTAA 200
AGATCGTCGC ACCGCTGGCA AGTCCTGGGG AAGGCCAGGA 240
TATCCTTGGC CCCTGTATGG GAATGAGGGT CTCGGCTGGG 280
CAGGGTGGCT CCTGTCCCCC CGTGGCTCTC GCCCTTCATG 320
GGGCCCCACT GACCCCCGGC ATAGATCGCG CAACTTGGGT 360
AAGGTCATCG ATACCCTAAC GTGCGGTTTT GCCGACCTCA 400
TGGGGTACAT TCCCGTCATC GGCGCCCCCG TTGGAGGCGT 440
TGCCAGAGCT CTCGCCCACG GAGTGAGGGT TCTGGAGGAT 480
GGGGTAAATT ATGCAACAGG GAATTTGCCC GGTTGCTCTT 520
TCTCTATCTT TCTCTTAGCC CTCTTGTCT 549

(2) INFORMATION FOR SEQ ID NO: 66

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 510 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: arg6

(2) INFORMATION FOR SEQ ID NO: 68

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68
ACAGAYCCGC AKAGRTCCCC CACG 24

(2) INFORMATION FOR SEQ ID NO: 69

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69
CGAACCTCGA GGTAGACGTC AGCCTATCCC 30

(2) INFORMATION FOR SEQ ID NO: 70

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70
GCAACCTCGT GGAAGGCGAC AACCTATCCC 30

(2) INFORMATION FOR SEQ ID NO: 71 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71
GTCACCAATG ATTGCCCTAA CTCGAGTATT 30

(2) INFORMATION FOR SEQ ID NO: 72 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72
GTCACGAACG ACTGCTCCAA CTCAAG 26

(2) INFORMATION FOR SEQ ID NO: 73

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73
TGGACATGAT CGCTGGWGCY CACTGGGG 28

(2) INFORMATION FOR SEQ ID NO: 74 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74
TGGAYATGGT GGYGGGGGCY CACTGGGG 28

(2) INFORMATION FOR SEQ ID NO: 75 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75 
ATGATGAACT GGTCVCCYAC 20 

(2) INFORMATION FOR SEQ ID NO: 76 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 26 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76

ACCTTVGCCC AGTTSCCCRC CATGGA 26

(2) INFORMATION FOR SEQ ID NO: 77 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77 
AACCCACTCT ATGYCCGGYC AT 

(2) INFORMATION FOR SEQ ID NO: 78 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78
GAATCGCTGG GGTGACCG 18

(2) INFORMATION FOR SEQ ID NO: 79 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79
CCATGAATCA CTCCCCTGTG AGGAACTA 28

(2) INFORMATION FOR SEQ ID NO: 80 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80
TTGCGGGGGC ACGCCCAA 18

(2) INFORMATION FOR SEQ ID NO: 81 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81 
YGAAGCGGGC ACAGTCARRC AAGARAGCAG GGC 

(2) INFORMATION FOR SEQ ID NO: 82 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82
RTARAGCCCY GWGGAGTTGC GCACTTGGTR GGC 33

(2) INFORMATION FOR SEQ ID NO: 83

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83
RATACTCGAG TTAGGGCAAT CATTGGTGAC RTG 33

(2) INFORMATION FOR SEQ ID NO: 84 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84
AGYRTGCAGG ATGGYATCRK BCGYCTCGTA CAC 33

(2) INFORMATION FOR SEQ ID NO: 85 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85
GTTRCCCTCR CGAACGCAAG GGACRCACCC CGG 33

(2) INFORMATION FOR SEQ ID NO: 86 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86
CGTRGGGGTY AYCGCCACCC AACACCTCGA GRC 33




(2) INFORMATION FOR SEQ ID NO: 87 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87
CGTYGYGGGG AGTTTGCCRT CCCTGGTGGC YAC 33

(2) INFORMATION FOR SEQ ID NO: 88

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88
CCCGACAAGC AGATCGATGT GACGTCGAAG CTG 33

(2) INFORMATION FOR SEQ ID NO: 89

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89
CCCCACGTAG ARGGCCGARC AGAGRGTGGC GCY 33

(2) INFORMATION FOR SEQ ID NO: 90

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90
YTGRCCGACA AGAAAGACAG ACCCGCAYAR GTC 33

(2) INFORMATION FOR SEQ ID NO: 91

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91
CGTCCAGTGG YGCCTGGGAG AGAAGGTGAA CAG 33

(2) INFORMATION FOR SEQ ID NO: 92

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92

GCCGGGATAG ATRGARCAAT TGCARYCTTG CGT 33
(2) INFORMATION FOR SEQ ID NO: 93

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93
CATATCCCAT GCCATGCGGT GACCCGTTAY ATG 33

(2) INFORMATION FOR SEQ ID NO: 94

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94
YACCAAYGCC GTCGTAGGGG ACCARTTCAT CAT 33

(2) INFORMATION FOR SEQ ID NO: 95 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95
GATGGCTTGT GGGATCCGGA GYASCTGAGC YAY 33

(2) INFORMATION FOR SEQ ID NO: 96 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96
GACTCCCCAG TGRGCWCCAG CGATCATRTC CAW 33

(2) INFORMATION FOR SEQ ID NO: 97

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97

CCCCACCATG GAGAAATACG CTATGCCCGC YAG 33

(2) INFORMATION FOR SEQ ID NO: 98

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98
TAGYAGCAGY ACTACYARGA CCTTCGCCCA GTT 33

(2) INFORMATION FOR SEQ ID NO: 99 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99
GSTGACGTGR GTKTCYGCGT CRACGCCGGC RAA 33

(2) INFORMATION FOR SEQ ID NO: 100 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100
GGAAGYTGGG ATGGTYARRC ARGASAGCAR AGO 33

(2) INFORMATION FOR SEQ ID NO: 101

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101
GTA2AYYCCG GACRCSTTGC GCACTTCRTA AGC 33

(2) INFORMATION FOR SEQ ID NO: 102

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102 
AATRCTTGMG TTGGAGCART CGTTYGTGAC ATG 33

(2) INFORMATION FOR SEQ ID NO: 103

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103
RGYRTGCATG ATCAYGTCCG YYGCCTCATA CAC 33

(2) INFORMATION FOR SEQ ID NO: 104

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104 
RTTGTYYTCC CGRACGCARG GCACGCACCC RGG 33 

(2) INFORMATION FOR SEQ ID NO: 105 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105
CGTGGGRGTS AGCGCYACCC AGCARCGGGA GSW 33

(2) INFORMATION FOR SEQ ID NO: 106 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106
YGTRGTGGGG AYGCTGKHRT TCCTGGCCGC VAR 33

(2) INFORMATION FOR SEQ ID NO: 107

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107
CCOiACGAGC AARTCGACRT GRCGTCGTAW TGT 33

(2) INFORMATION FOR SEQ ID NO: 108 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108 
YCCCACGTAC ATAGCSGAMS AGARRGYAGC CGY 33 

(2) INFORMATION FOR SEQ ID NO: 109 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109
CTGGGAGAYR AGRAAAACAG ATCCGCARAG RTC 33
(2) INFORMATION FOR SEQ ID NO: 110

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110
YGTCTCRTGC CGGCCAGSBG AGAAGGTGAA YAG 33

(2) INFORMATION FOR SEQ ID NO: 111

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111

GCCGGGATAG AKKGAGCART TGCAKTCCTG YAC 33

(2) INFORMATION FOR SEQ ID NO: 112

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112
CATATCCCAA GCCATRCGRT GGCCTGAYAC CTG 33

(2) INFORMATION FOR SEQ ID NO: 113 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113
CACTARGGCT GYYGTRGGYG ACCAGTTCAT CAT 33 

(2) INFORMATION FOR SEQ ID NO: 114

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114 
GACRGCTTGT GGGATCCGGA GTAACTGCGA YAC 

(2) INFORMATION FOR SEQ ID NO: 115 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115
GACTCCCCAG TGRGCCCCCG CCACCATRTC CAT 33

(2) INFORMATION FOR SEQ ID NO: 116

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116

SCCCACCATO 0AWMAGTA6G CAAG6CCCGC YA6 

(2) INFORMATION FOR SEQ ID NO: 117 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117 
GAGTAGCATC ACAATCAADA CCTTAGCCCA GTT

(2) INFORMATION FOR SEQ ID NO: 118 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118
YGWCRYGXRG GTRTKCCCGT CAACGCCGGC AAA 33

(2) INFORMATION FOR SEQ ID NO: 119 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119
TCCTCACAGG GGAGTGATTC ATGGTGGAGT GTC 33

(2) INFORMATION FOR SEQ ID NO: 120 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120 
AT0GCTA6AC 6CTTTCT6C6 TGAAGACAGT AGT 
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(2) INFORMATION FOR SEQ ID NO: 121 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121 
GCCTGGAGGC TGCACGRCAC TCATACTAAC GCC 33

(2) INFORMATION FOR SEQ ID NO: 122 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122

ScIgaccac tatggctct. ccgggagggg ggg 

(2) INFORMATION FOR SEQ ID NO: 123

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123
tStCCYGGC AATTCCGGTG TACTCACCGG TTC 

(2) INFORMATION FOR SEQ ID NO: 124 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124
GCATIGAGCG GGTTDATCCA AGAAAGGACC CG6 

(2) INFORMATION FOR SEQ ID NO: 125 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125 
AGCAGTCTYG CGGGGGCACG CCCAARTCTC CAG

(2) INFORMATION FOR SEQ ID NO: 126 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126
ACAAGGCCTT TCGCGACCCA ACACTACTCG GCT

(2) INFORMATION FOR SEQ ID NO: 127

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127
GGGGCACTCG CAAGCACCCT ATCAGGCAGT ACC 33

(2) INFORMATION FOR SEQ ID NO: 128 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128
YGTGCTCATG RTGCACGGTC TACGAGACCT CCC

(2) INFORMATION FOR SEQ ID NO: 129

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129 
GTTACGTTTG KTTYTTYTTT GRGGTTTRGG AWT 33 

(2) INFORMATION FOR SEQ ID NO: 130 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130
CGGGAACTTR ACGTCCTGTG GGCGRCGGTT GGT 33

(2) INFORMATION FOR SEQ ID NO: 131 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131 
CARGTAAACT CCACCRACGA TCTGRCCRCC RCC 33 
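
Several entries in this listing use degenerate symbols, so a single printed sequence stands for a pool of concrete oligonucleotides. The sketch below, which is not part of the original disclosure, counts and enumerates that pool for SEQ ID NO: 131. The expansion table follows the standard IUPAC definitions; the function names `degeneracy` and `expand` are assumptions made for illustration.

```python
from itertools import product

# Standard IUPAC expansion of each nucleotide code into the bases it denotes.
IUPAC_EXPANSION = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(sequence):
    """Number of concrete sequences represented by a degenerate sequence."""
    count = 1
    for base in sequence:
        count *= len(IUPAC_EXPANSION[base])
    return count


def expand(sequence):
    """List every concrete sequence represented by a degenerate sequence."""
    choices = [IUPAC_EXPANSION[base] for base in sequence]
    return ["".join(combo) for combo in product(*choices)]


# SEQ ID NO: 131 as printed above, with the block-separating spaces removed.
seq_131 = "CARGTAAACT CCACCRACGA TCTGRCCRCC RCC".replace(" ", "")
print(degeneracy(seq_131))   # five R positions, two choices each
print(len(expand(seq_131)))  # same count, obtained by enumeration
```

Enumeration grows multiplicatively with each degenerate position, so for heavily degenerate entries the count from `degeneracy` is the safer call to make first.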

(2) INFORMATION FOR SEQ ID NO: 132

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132
RCGCACACCC AAYCTRGGGC CCCTGCGCGG CAA

(2) INFORMATION FOR SEQ ID NO: 133

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133
AGGTTGCGAC CGCTCGGAAG TCTTYCTRGT CGC

(2) INFORMATION FOR SEQ ID NO: 134 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134
RCSHRCCTTG GGGATAGGCT GACGTCWACC TCG

(2) INFORMATION FOR SEQ ID NO: 135

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135
RCGHRCCTTG GGGATAGGTT GTCGCCWTCC ACG 33

(2) INFORMATION FOR SEQ ID NO: 136

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136

YCCRGGCTGR GCCCAGRYCC TRCCCTCGGR YYG 33

(2) INFORMATION FOR SEQ ID NO: 137

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137 
BSHRCCCTCR TTRCCRTAGA GGGGCCADGG RTA 33 

(2) INFORMATION FOR SEQ ID NO: 138 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138 
GCCRCGGGGW GACAGGAGCC ATCCYGCCCA CCC 33

(2) INFORMATION FOR SEQ ID NO: 139

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139
CCGGGGGTCY GTGGGGCCCC AYCTAGGCCG RGA 33

(2) INFORMATION FOR SEQ ID NO: 140 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140
ATCGATGACC TTACCCAART TRCGCGACCT RCG 33

(2) INFORMATION FOR SEQ ID NO: 141

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141 
CCCCATGAGR TCGGCGAAGC CGCAYGTRAG GGT 33 

(2) INFORMATION FOR SEQ ID NO: 142 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142 
GCCYCCWARR GGGGCGCCGA CGAGCGGWAT RTA 33

(2) INFORMATION FOR SEQ ID NO: 143 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143

AACCCG6ACR CCRTOYGCCA RGGCCCTGGC AGC 

(2) INFORMATION FOR SEQ ID NO: 144 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144
RTTCCCTGTT GCATAGTTCA CGCCGTCYTC CAG

(2) INFORMATION FOR SEQ ID NO: 145

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145
CAKSAGGAAG AKAGAGAAAG AGCAACCRGG MAR 33

(2) INFORMATION FOR SEQ ID NO: 146

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146
AGGCATAGGA CCCGTGTCTT 20 

(2) INFORMATION FOR SEQ ID NO: 147

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147 
CTTCTTTGGA GAAAGTGGTG 20

CLAIMS

1. As a composition of matter, a non-naturally
occurring nucleic acid having a non-HCV-1 nucleotide
sequence of eight or more nucleotides corresponding to
a nucleotide sequence within the hepatitis C virus
genome.

2. The composition of claim 1 wherein said nucleotide
sequence corresponding to a non-HCV-1 nucleotide
sequence within the hepatitis C virus genome is
selected from the regions consisting of the NS5 region,
envelope 1 region, 5'UT region, and the core region.

3. The composition of claim 1 wherein said nucleotide
sequence corresponding to a non-HCV-1 nucleotide
sequence within the hepatitis C virus genome
corresponds to a sequence in the NS5 region.

4. The composition of claim 3 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence within
the hepatitis C virus genome is selected from a
sequence within sequences numbered 2-22.

5. The composition of claim 1 wherein said nucleotide
sequence corresponding to a non-HCV-1 nucleotide
sequence within the hepatitis C virus genome
corresponds to a sequence in the envelope 1 region.

6. The composition of claim 5 wherein said nucleotide 
sequence corresponding to a non-HCV-1 sequence within 
the hepatitis C virus genome corresponds to a sequence 
within sequence numbers 24-32. 

7. The composition of claim 1 wherein at least one 
sequence corresponding to a non-HCV-1 nucleotide 
sequence within the hepatitis C virus genome 
corresponds to a sequence in the 5'UT region.

8. The composition of claim 7 wherein said nucleotide 
sequence corresponding to a non-HCV-1 sequence within 
the hepatitis C virus genome corresponds to a sequence 
within sequences numbered 34-51.

9. The composition of claim 1 wherein said nucleotide
sequence corresponding to a non-HCV-1 nucleotide
sequence within the hepatitis C virus genome
corresponds to a sequence in the core region.

10. The composition of claim 9 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence within
the hepatitis C virus genome corresponds to a sequence
within sequences numbered 53-66.

11. The composition of claim 1 wherein said
non-naturally occurring nucleic acid has a nucleotide 
sequence corresponding to one or more genotypes of 
hepatitis C virus. 



12. The composition of claim 11 wherein said
non-naturally occurring nucleic acid has a sequence
corresponding to a sequence of a first genotype which
first genotype is defined substantially by sequences
numbered 1-6 in the NS5 region, 23-25 in the envelope 1
region, 33-38 in the 5'UT region, and 52-57 in the core
region.

13. The composition of claim 11 wherein said
non-naturally occurring nucleic acid has a sequence
corresponding to a sequence of a second genotype which
second genotype is defined substantially by sequences
numbered 7-12 in the NS5 region, 26-28 in the envelope
1 region, 39-45 in the 5'UT region, and 58-64 in the
core region.

14. The composition of claim 11 wherein said
non-naturally occurring nucleic acid has a sequence
corresponding to a sequence of a third genotype which
third genotype is defined substantially by sequences
numbered 13-17 in the NS5 region, 32 in the envelope 1
region, 46-47 in the 5'UT region and 65-66 in the core
region.

15. The composition of claim 11 wherein said
non-naturally occurring nucleic acid has a sequence
corresponding to a sequence of a fourth genotype which
fourth genotype is defined substantially by sequences
numbered 20-22 in the NS5 region, 29-31 in the envelope
1 region and 48-49 in the 5'UT region.

16. The composition of claim 11 wherein said 
non-naturally occurring nucleic acid has a sequence 
corresponding to a sequence of a fifth genotype which 
fifth genotype is defined substantially by sequences 
numbered 18-19 in the NS5 region and 50-51 in the 5'UT 
region. 
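
Claims 12 to 16 define five genotypes by ranges of sequence numbers in up to four genomic regions. As a reading aid only, the ranges recited in those claims can be transcribed into a small lookup table. The following is a sketch under the assumption that "sequences numbered" refers to the sequence identifiers of the listing; the dictionary layout and the function name `classify` are illustrative and do not appear in the original.

```python
# Sequence-number ranges per genotype and region, transcribed from claims 12-16.
# range(a, b) covers a through b - 1, so range(1, 7) means sequences 1-6.
GENOTYPE_SEQUENCES = {
    "first": {
        "NS5": range(1, 7), "envelope 1": range(23, 26),
        "5'UT": range(33, 39), "core": range(52, 58),
    },
    "second": {
        "NS5": range(7, 13), "envelope 1": range(26, 29),
        "5'UT": range(39, 46), "core": range(58, 65),
    },
    "third": {
        "NS5": range(13, 18), "envelope 1": range(32, 33),
        "5'UT": range(46, 48), "core": range(65, 67),
    },
    "fourth": {
        "NS5": range(20, 23), "envelope 1": range(29, 32),
        "5'UT": range(48, 50),
    },
    "fifth": {
        "NS5": range(18, 20), "5'UT": range(50, 52),
    },
}


def classify(sequence_number):
    """Return (genotype, region) for a sequence number, or None if unlisted."""
    for genotype, regions in GENOTYPE_SEQUENCES.items():
        for region, numbers in regions.items():
            if sequence_number in numbers:
                return genotype, region
    return None


print(classify(32))  # the single envelope 1 sequence recited in claim 14
print(classify(19))  # one of the NS5 sequences recited in claim 16
```

Laid out this way, the table makes it easy to see that the fourth genotype has no core entry and the fifth has neither envelope 1 nor core entries in these claims.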

17. The composition of claim 1 wherein said
non-naturally occurring nucleic acid is capable of 
priming a reaction for the synthesis of nucleic acid to 
form a nucleic acid having a nucleotide sequence 
corresponding to hepatitis C virus. 



WO 92/19743 PCT/US92/04036

18. The composition of claim 1 wherein said
non-naturally occurring nucleic acid has label means 
for detecting a hybridization product. 

19. The composition of claim 1 wherein said
non-naturally occurring nucleic acid has support means
for separating a hybridization product from solution.

20. The composition of claim 1 wherein said
non-naturally occurring nucleic acid prevents the
transcription or translation of viral nucleic acid.

21. A method of forming a hybridization product with a
hepatitis C virus nucleic acid comprising the following
steps:

a. placing a non-naturally occurring nucleic
acid having a nucleotide sequence of eight or
more nucleotides corresponding to a non-HCV-1
sequence in the hepatitis C viral genome into
conditions in which hybridization conditions
can be imposed, said non-naturally occurring
nucleic acid capable of forming a
hybridization product with said hepatitis C
virus nucleic acid under hybridization
conditions; and

b. imposing hybridization conditions to form a
hybridization product in the presence of
hepatitis C virus nucleic acid.

22. The method of claim 21 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence in the
hepatitis C virus genome corresponds to a sequence
within at least one of the regions consisting
essentially of NS5 region, envelope 1 region, 5'UT
region, and the core region.

23. The method of claim 21 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence
corresponds to a sequence within the NS5 region.

24. The method of claim 23 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence
corresponds to a sequence within sequences numbered
2-22.

25. The method of claim 21 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence
corresponds to a sequence within the envelope 1 region.

26. The method of claim 25 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence is
selected from a sequence within sequences numbered
24-32.

27. The method of claim 21 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence
corresponds to a sequence within the 5'UT region.

28. The method of claim 27 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence is selected
from a sequence within sequences numbered 34-51.

29. The method of claim 21 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence
corresponds to a sequence within the core region.

30. The method of claim 29 wherein said nucleotide
sequence corresponding to a non-HCV-1 sequence is selected
from a sequence within sequences numbered 53-66.

31. The method of claim 21 wherein said nucleotide
sequence corresponding to a non-HCV-1 nucleotide sequence
corresponds to one or more genotypes of hepatitis C
virus.

32. The method of claim 21 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a first genotype which first genotype is
defined substantially by sequences numbered 1-6 in the
NS5 region, 23-25 in the envelope 1 region, 33-38 in
the 5'UT region, and 52-57 in the core region.

33. The method of claim 21 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a second genotype which second genotype
is defined substantially by sequences numbered 7-12 in
the NS5 region, 26-28 in the envelope 1 region, 39-45
in the 5'UT region, and 58-64 in the core region.

34. The method of claim 21 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a third genotype which third genotype is
defined substantially by sequences numbered 13-17 in
the NS5 region, 32 in the envelope 1 region, 46-47 in
the 5'UT region and 65-66 in the core region.

35. The method of claim 21 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a fourth genotype which fourth genotype
is defined substantially by sequences numbered 20-22 in
the NS5 region, 29-31 in the envelope 1 region and
48-49 in the 5'UT region.

36. The method of claim 21 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a fifth genotype which fifth genotype is
defined substantially by sequences numbered 18-19 in
the NS5 region and 50-51 in the 5'UT region.

37. The method of claim 21 wherein said hybridization 
product is capable of priming a reaction for the 
synthesis of nucleic acid. 

38. The method of claim 21 wherein said non-naturally 
occurring nucleic acid has label means for detecting a 
hybridization product. 

39. The method of claim 21 wherein said non-naturally
occurring nucleic acid has support means for separating 
the hybridization product from solution. 

40. The method of claim 21 wherein said non-naturally
occurring nucleic acid prevents the transcription or
translation of viral nucleic acid.

41. As a composition of matter, a non-naturally
occurring polypeptide corresponding to a non-HCV-1
nucleotide sequence of nine or more nucleotides which
sequence of nine or more nucleotides corresponds to a
sequence within hepatitis C virus genomic sequences.

42. The composition of claim 41 wherein said non-HCV-1
sequence is selected from one of the regions consisting
of NS5 region, envelope 1 region, and the core region.

43. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence corresponds to a sequence in the
NS5 region.

44. The composition of claim 43 wherein said non-HCV-1
sequence is selected from a sequence within sequences
numbered 2-22.

45. The composition of claim 41 wherein said non-HCV-1
sequence corresponds to a sequence in the envelope 1
region.

46. The composition of claim 45 wherein said non-HCV-1
sequence is selected from a sequence within sequences
numbered 24-32.

47. The composition of claim 41 wherein said non-HCV-1
sequence corresponds to a sequence in the core region.

48. The composition of claim 47 wherein said non-HCV-1
sequence is selected from a sequence within sequences
numbered 52-66.

49. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a nucleotide sequence
corresponding to one or more genotypes of hepatitis C
virus.

50. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a sequence corresponding to a
sequence of a first genotype which first genotype is
defined substantially by sequences numbered 1-6 in the
NS5 region, 23-25 in the envelope 1 region, and 52-57
in the core region.

51. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a sequence corresponding to a
sequence of a second genotype which second genotype is
defined substantially by sequences numbered 7-12 in the
NS5 region, 26-28 in the envelope 1 region, and 58-64
in the core region.

52. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a sequence corresponding to a
sequence of a third genotype which third genotype is
defined substantially by sequences numbered 13-17 in
the NS5 region, 32 in the envelope 1 region, and 65-66
in the core region.

53. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a sequence corresponding to a
sequence of a fourth genotype which fourth genotype is
defined substantially by sequences numbered 20-22 in
the NS5 region, 29-31 in the envelope 1 region and
48-49 in the 5'UT region.

54. The composition of claim 41 wherein said non-HCV-1
nucleotide sequence has a sequence corresponding to a 
sequence of a fifth genotype which fifth genotype is 
defined substantially by sequences numbered 18-19 in 
the NS5 region and 50-51 in the 5'UT region. 

55. The composition of claim 41 wherein said
polypeptide is capable of generating an immune reaction 
in a host. 

56. An antibody capable of selectively binding to the 
composition of claim 41. 

57. A method of detecting one or more genotypes of
hepatitis C virus comprising the following steps:

a) placing a non-naturally occurring nucleic acid
having a nucleotide sequence of eight or more
nucleotides corresponding to one or more genotypes
of hepatitis C virus under conditions where
hybridization conditions can be imposed;

b) imposing hybridization conditions to form a
hybridization product in the presence of hepatitis
C virus nucleic acid; and

c) monitoring the non-naturally occurring nucleic
acid for the formation of a hybridization product,
which hybridization product is indicative of the
presence of the genotype of hepatitis C virus.
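
The steps of claim 57 describe a laboratory procedure, but the underlying pairing rule can be illustrated computationally. The sketch below is a crude in-silico analogy only and is not the claimed method: it treats a probe as "hybridizing" wherever the target contains an exact match to the probe's reverse complement, with degenerate probe positions allowed to match any base they denote. The probe and target strings are made-up toy data, the function names are assumptions, and real hybridization depends on conditions (temperature, salt, mismatches) that this model ignores.

```python
# Bases denoted by each standard IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each code, e.g. R (A or G) pairs with Y (T or C).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(sequence):
    """Reverse complement of a sequence written with IUPAC codes."""
    return sequence.translate(COMPLEMENT)[::-1]


def find_sites(probe, target):
    """Start positions in target (plain A/C/G/T) that the probe could pair with."""
    site = reverse_complement(probe)
    width = len(site)
    hits = []
    for start in range(len(target) - width + 1):
        window = target[start:start + width]
        # Every target base must be one of the bases the site code allows.
        if all(base in IUPAC[code] for code, base in zip(site, window)):
            hits.append(start)
    return hits


# Hypothetical toy probe and target, chosen only to exercise the functions.
toy_probe = "GGRTC"
toy_target = "TTGATCCAAGACCC"
print(reverse_complement(toy_probe))
print(find_sites(toy_probe, toy_target))
```

In a genotyping setting, the analogue of step c) would be to run each genotype-specific probe against the sample sequence and report which probes return at least one site.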

58. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a first genotype which first genotype is
defined substantially by sequences numbered 1-6 in the
NS5 region, 23-25 in the envelope 1 region, 33-38 in
the 5'UT region, and 52-57 in the core region.

59. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a second genotype which second genotype
is defined substantially by sequences numbered 7-12 in
the NS5 region, 26-28 in the envelope 1 region, 39-45
in the 5'UT region, and 58-64 in the core region.

60. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence of a third genotype which third genotype is
defined substantially by sequences numbered 13-17 in
the NS5 region, 32 in the envelope 1 region, 46-47 in
the 5'UT region and 65-66 in the core region.

61. The method of claim 57 wherein said non-naturally 
occurring nucleic acid has a sequence corresponding to 
a sequence of a fourth genotype which fourth genotype 
is defined substantially by sequences numbered 20-22 in 
the NS5 region, 29-31 in the envelope 1 region and 
48-49 in the 5'UT region.

62. The method of claim 57 wherein said non-naturally 
occurring nucleic acid has a sequence corresponding to 
a sequence of a fifth genotype which fifth genotype is 
defined substantially by sequences numbered 18-19 in
the NS5 region. 

63. The method of claim 57 wherein said non-naturally 
occurring nucleic acid has a sequence corresponding to 
a sequence numbered 67-145.

64. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence numbered 69, 71, 73 and 81-99 to identify
Group I genotypes in the core and region of the HCV
genome.

65. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence numbered 70, 72, 70 and 100-118 to identify
Group II genotypes in the core and envelope regions of
the HCV genome.

66. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence corresponding to
a sequence numbered 77 to identify Group III genotypes
in the 5'UT region of the HCV genome.

67. The method of claim 57 wherein said non-naturally
occurring nucleic acid has a sequence numbered 79 to
identify Group IV genotypes in the 5'UT region of the
HCV genome.